DESCRIPTION 

STREPTOCOCCUS PNEUMONIAE CAPSULAR
POLYSACCHARIDE GENES AND FLANKING REGIONS 

BACKGROUND OF THE INVENTION

The present application is a continuation-in-part of
co-pending U.S. Patent Application Serial No. 08/243,546,
filed May 16, 1994, the entire text and figures of which
are specifically incorporated herein by reference without
disclaimer. The government owns certain rights in the
present invention pursuant to grant number AI28457 from
the Public Health Service and T32 AI07041-13 from the
National Institutes of Health.

1. Field of the Invention 

The present invention relates generally to the fields of
bacterial capsule formation and the genes responsible for
polysaccharide synthesis. More particularly, it concerns
the genes and gene products that direct the formation of
the Streptococcus pneumoniae serotype-specific
polysaccharide capsule. The present invention also
includes the identification of non-type specific gene
sequences flanking the capsule genes, and their use for
the directed expression of specific serotypes of
S. pneumoniae capsules.

2. Description of the Related Art

Infections due to S. pneumoniae are among the top ten
causes of death in the United States. The normal
populations most affected are young children and the
elderly: pneumococcal pneumonia, mainly affecting the
elderly, causes >40,000 deaths per year among ~500,000
cases and represents 60 to 80% of all bacterial
pneumonias; pneumococcal meningitis, with ~4,000
cases/year, represents 11% of the total meningitis cases
and has a fatality rate of >30%, greater than twice that
of the two other leading causes, N. meningitidis and
H. influenzae; bacteremia, usually following pneumonia or
meningitis, accounts for >35,000 cases per year (>30%
fatal); and otitis media, the most frequent reason for
pediatric office visits after well-child care, is caused
by S. pneumoniae in ~50% of cases (ACIP, 1981; ACIP,
1989; Austrian, 1984; Burke et al., 1971; Center for
Disease Control, 1978; Johnston and Sell, 1964; Koch and
Dennison, 1974).

Other populations have an even higher incidence of
pneumococcal infections: approximately 30% of children
with sickle cell disease will have severe pneumococcal
infections in the first three years of life and ~35% of
those will die (Overturf et al., 1977; Powars et al.,
1981; Powars, 1975); in both adults and children with HIV
infections, S. pneumoniae is the major cause of invasive
bacterial respiratory disease (Janoff et al., 1992).
Patients with lymphomas, Hodgkin's disease, multiple
myeloma, splenectomy, and other debilitating diseases or
immunologic deficiencies are particularly susceptible to
serious pneumococcal disease, as are those with chronic
illnesses such as diabetes mellitus and heart disease.
Furthermore, strains of S. pneumoniae are emerging that
harbor resistances to multiple antibiotics, including
penicillin (Appelbaum, 1992; Jacobs et al., 1978;
Landesman et al., 1982).

The polysaccharide capsule of S. pneumoniae is the major
virulence determinant of this organism. Despite early
studies of the genetics, pathogenesis, and immunology of
capsular polysaccharides, it remains unclear why certain
capsular types appear to have a greater capacity to cause
disease. Of the more than 80 known capsular serotypes, 23
account for more than 90% of all pneumococcal infections.

In children, the most prevalent types are 3, 6, 14, 19,
and 23 (Gray and Dillon, 1986), whereas in adults types 1,
3, 4, 6, 7, 8, 9, 12, 14, 18, 19, and 23 prevail (Finland
and Barnes, 1977). In assays of opsonophagocytosis
(Branconier and Odeberg, 1982; Giebink et al., 1977;
Knecht et al., 1970), complement activation and
deposition (Pine, 1975; Gordon et al., 1986; Hostetter,
1986; Stephens et al., 1977; Winkelstein et al., 1980;
Winkelstein et al., 1976), and mouse virulence (Briles et
al., 1992; Briles et al., 1986; Knecht et al., 1970;
MacLeod, 1965; Walter et al., 1941; Yother et al., 1982),
levels of virulence have frequently been found to vary
with the type of capsule expressed. For example, isolates
expressing type 3, 4, and 19 capsules are highly
resistant to phagocytosis, whereas those expressing types
6A, 14, 23 and 37 are significantly less resistant
(Branconier and Odeberg, 1982; Hostetter, 1986; Knecht et
al., 1970; Wood and Smith, 1949).

The importance of the capsule also results from the fact
that anti-capsular antibodies are highly protective
against infection. Nonetheless, the current
polysaccharide-based vaccine is not particularly useful
in some of the populations most affected by pneumococcal
disease, e.g., the very young and the elderly, because of
poor or absent immune response to polysaccharide
antigens.

The ability to produce improved vaccines and therapies
for pneumococcal infections will most likely be the
result of a better understanding of the basic pathogenic
mechanisms of the organism. This understanding
necessarily includes the genetic basis for the expression
of serotype-specific polysaccharides and the role of
capsular type per se in pathogenesis.

Some 85 different serotypes of Streptococcus pneumoniae,
differing in the structure of the polysaccharide
produced, have been identified (van Dam et al., 1990).
The basis for the emergence of new capsule types remains
obscure. Whether influenced by mutation, recombination,
or immune selection, genetic exchange of DNA is likely to
have played a major role in the evolution of capsule
types. It is known that pneumococcal capsule types can be
changed through genetic transformation in vitro (Dawson,
1930; Dawson and Sia, 1931; Langvad-Nielson, 1944; Avery
et al., 1944). Epidemiological studies suggest that a
significant degree of genetic exchange occurs in vivo
(Grain et al., 1990; Coffey et al., 1991; Versalovic et
al., 1993). However, the mechanism by which capsule types
are exchanged is not fully understood.

Extensive study was made of the genetics of capsular
polysaccharide synthesis in S. pneumoniae using
spontaneous mutants with defects in biosynthetic
functions (Effrussi-Taylor, 1951; Ravin, 1960; Bernheimer
and Wermundsen, 1972). The results of these studies
indicated that the genes for polysaccharide synthesis
were closely linked and could be transferred as a unit
during genetic transformation. A cassette-type model of
capsule type change based on these data has been proposed
(Taylor, 1949; Austrian et al., 1959; Bernheimer and
Wermundsen, 1972). According to the model, the
type-specific genes for each capsule type would be
present only in the genome of a strain of that capsule
type and would show little homology to the type-specific
genes of other capsule types. The type-specific genes
would be located in homologous sites in the different
chromosomes, clustered together between regions of highly
homologous flanking DNA. During transformation,
recombination would occur in the flanking regions,
resulting in the replacement of the recipient's
type-specific region by that of the donor.

The clustering of capsule biosynthetic genes proposed by
the model is analogous to the organization that has been
observed in the gram-negative bacteria Escherichia coli
(K antigens) (Roberts et al., 1988), Neisseria
meningitidis (Frosch et al., 1989), and Haemophilus
influenzae (Kroll et al., 1989). For each of these
organisms, the type-specific region encoding biosynthetic
functions (region 2) is flanked by highly homologous
regions necessary for polysaccharide translocation
(region 1) and modification (region 3). Since
H. influenzae, like S. pneumoniae, is naturally
transformable, it has been proposed that capsule type
change in this pathogen may occur by transformation with
the type-specific gene cluster from a different serotype
(Zwahlen et al., 1989).

The one exception to the cassette model of capsule type
change in S. pneumoniae is binary capsule formation. When
non-encapsulated mutants have been transformed with
chromosomal DNA from a strain of a different capsule
type, most of the encapsulated transformants express the
capsule type of the donor. However, at a frequency 10 to
100 times lower, encapsulated transformants are obtained
which express both capsules (Bernheimer and Wermundsen,
1972). In some of these transformants, the second set of
capsule genes is closely linked to the original set.
However, these strains are unstable and, at high
frequency, lose the ability to produce the original
capsule type. In binary strains in which the acquired
capsule genes are unlinked to the original genes, binary
capsule production is stable (Bernheimer and Wermundsen,
1969). Elucidation of the mechanism of binary capsule
type formation may be the key to understanding novel
capsule type creation in S. pneumoniae.

It is clear that a better understanding of the genetics
of capsular polysaccharide synthesis in Streptococcus
pneumoniae is needed. The identification of type-specific
capsular genes and the ability to transfer them, singly
or as a gene cassette, to desired recipients, will
elucidate the role of capsular types in virulence and
allow easy identification of S. pneumoniae serotype. This
ability will improve existing methods of diagnosis,
identifying not only the presence of S. pneumoniae but
also the capsular type of the invading strain.
Furthermore, it will allow construction of strains
producing elevated levels of capsular polysaccharides for
improved vaccines.

SUMMARY OF THE INVENTION 

Capsular Polysaccharide Genes and Flanking Regions 

The present invention arises out of the discovery and
sequence characterization of a gene family that confers
on S. pneumoniae the ability to produce type-specific
capsules that define the serotype of the organism. The
inventors refer to this gene family as the capsule
synthesis or cps genes. These genes encode the various
enzymatic functions of capsule synthesis and determine
the particular structure of the capsule polysaccharide
that is produced, and thereby define serotype. These
genes, designated cpsB, cpsC, cpsE, cpsD, cpsS, cpsU,
cpsM, tnpA, and plpA, map to specific DNA segments of
sizes believed to range from about 0.5 kb to greater than
10 kb that appear to be type-specific for S. pneumoniae.
Based upon the findings of the inventors, many
type-specific genes may be distinguished on the basis of
restriction fragment length polymorphism (RFLP)
analysis.
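
By way of illustration only, the principle of distinguishing
type-specific regions by RFLP can be sketched in silico. The
sequences below are short hypothetical stand-ins, not sequences
of the invention; the EcoRI recognition site (GAATTC, cut after
the first G) is used purely as a familiar example, and the
function names are assumptions introduced for this sketch.

```python
import re


def digest(seq: str, site: str, cut_offset: int) -> list:
    """Fragment lengths from cutting a linear sequence at every site.

    cut_offset is the cut position relative to the start of the
    recognition site (1 for EcoRI, which cuts G^AATTC).
    """
    # Zero-width lookahead so that overlapping sites are also found.
    pattern = f"(?={re.escape(site)})"
    cuts = [m.start() + cut_offset for m in re.finditer(pattern, seq)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def rflp_differs(seq_a: str, seq_b: str, site: str, cut_offset: int) -> bool:
    """True if the two sequences give different fragment-length patterns."""
    frags_a = sorted(digest(seq_a, site, cut_offset))
    frags_b = sorted(digest(seq_b, site, cut_offset))
    return frags_a != frags_b


# Hypothetical toy sequences: type_b has lost the second EcoRI site.
type_a = "AAAAGAATTCAAAAAAGAATTCAA"
type_b = "AAAAGAATTCAAAAAAGACTTCAA"

if __name__ == "__main__":
    print(digest(type_a, "GAATTC", 1))
    print(digest(type_b, "GAATTC", 1))
    print(rflp_differs(type_a, type_b, "GAATTC", 1))
```

In practice the fragment patterns would be read from a blot or
gel rather than computed; the sketch shows only why a change in
a restriction site yields a distinguishable pattern.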

The present invention also includes the discovery and
sequence characterization of non-type specific DNA
regions that flank both sides of the cps locus. These
flanking DNA segments can be used to identify the
location of cps flanking DNA from any strain of
S. pneumoniae. This invention thus provides the ability
to identify the cps locus within all strains and allows
for the subsequent isolation and characterization of all
genetic elements involved in determining S. pneumoniae
serotype.

The classification of S. pneumoniae strains is based on
serological analysis of cell surface structures.
Eighty-five distinct serotypes have been identified to
date based on the formation of surface molecules. The
formation of the cell surface of S. pneumoniae, and in
particular its polysaccharide capsule, has, until now,
eluded characterization at the molecular genetic level.
However, studies of the biosynthesis of the
polysaccharide capsule have revealed that at least some
of the genes are likely to encode enzymes involved in the
preparation of the sugar backbones for incorporation into
the saccharide backbone, such as UDP-glucose
dehydrogenase.

As mentioned above, these polymorphic "type-specific"
sequence regions were found to be bounded or flanked by
"non-type specific" regions having sequence elements that
are apparently shared among the various subtypes. These
regions, referred to as the left and right flanking
regions, extend for at least 1 to 3 kb on either side of
the cps genes. Thus, in type 3, the entire length of the
capsule synthesis genes, including the non-type specific
flanking regions and any DNA sequences in between, is
greater than 9 kb. In other capsule types the length is
on the order of about 5 to 20 kb, with the maximum length
being related to the complexity of the polysaccharide
encoded. Importantly, it is these flanking regions that
allow recombination and integration of the type-specific
capsule genes to occur. Thus, when a selected cps gene or
genes is positioned between the flanking regions, the
resultant construct can be stably integrated into a
S. pneumoniae host.
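
The outcome of such a flank-directed exchange can be pictured
with a simple string model. This is a conceptual sketch only,
assuming exact, single-copy flanking sequences and ignoring the
biology of recombination itself; the sequences and the function
names are hypothetical and are not part of the disclosure.

```python
def between_flanks(genome: str, left: str, right: str) -> tuple:
    """Return (start, end) of the region lying between the two flanks."""
    i = genome.find(left)
    if i < 0:
        raise ValueError("left flanking region not found")
    start = i + len(left)
    end = genome.find(right, start)
    if end < 0:
        raise ValueError("right flanking region not found")
    return start, end


def cassette_exchange(recipient: str, donor: str, left: str, right: str) -> str:
    """Replace the recipient's type-specific region with the donor's.

    Both genomes are assumed to carry the same conserved flanks.
    """
    r_start, r_end = between_flanks(recipient, left, right)
    d_start, d_end = between_flanks(donor, left, right)
    return recipient[:r_start] + donor[d_start:d_end] + recipient[r_end:]


# Hypothetical toy sequences.
LEFT, RIGHT = "ACGTACGA", "TTGGCCAA"
recipient = "GG" + LEFT + "AAAAAA" + RIGHT + "CC"
donor = "TT" + LEFT + "CCCCGGGG" + RIGHT + "AT"

if __name__ == "__main__":
    print(cassette_exchange(recipient, donor, LEFT, RIGHT))
```

The recipient's own chromosome outside the flanks is left
untouched, which mirrors the notion that only the region between
the conserved flanks is replaced.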

The present discoveries concerning the cps gene regions,
the identification of conserved flanking regions, and the
construction of erythromycin-resistant insertions in
adjacent, non-type specific DNA elements, allow for the
changing of capsular serotypes by "cassetting-in" the
biosynthetic genes for different serotypes. This
methodology may be employed for generating high yield
capsular polysaccharide producing strains of different
(heterologous) serotypes. For a high yielding strain, the
existing serotype biosynthetic genes may be deleted and a
different serotype's genes inserted ("cassetted in").
These other serotypes may come from strains where their
natural genetic background gives only poor or moderate
levels of capsular production.

As used herein, the term "gene cassette" or simply
"cassette", is intended to refer to any DNA segment
flanked by one, or both, or part of, the cps flanking
regions or a cps genetic element or DNA sequence which is
found between the flanking regions.

Prior to the present invention, the foregoing underlying
mechanism of genetic recombination of the capsule
synthesis genes was unknown, as were the specific
sequences involved. A principal contribution of the
present inventors is the specific characterization of the
individual genes and flanking regions, allowing for their
manipulation and individual transfer to hosts. Now,
discrete nucleic acid segments, or cassettes, containing
a cps flanking region, gene or genes, can be readily
prepared and easily used in transformation.

Hybridization Probes and Primers 

Accordingly, in certain embodiments the invention
concerns nucleic acid segments that hybridize with cps
genes and/or flanking regions. The nucleic acid segments
will generally be less than about 20 kb in length, and
preferably less than about 15 kb in length, or even 10
kb, and will comprise a non-type specific S. pneumoniae
cps gene flanking region, and/or a type-specific cps
gene, of sufficient length to allow hybridization with a
pneumococcal cps flanking region and/or gene. Nucleic
acid segments that are capable of hybridizing with the 5'
flanking region, with the 3' flanking region, with both
flanking regions, with one or more of the genes
designated cpsB, cpsC, cpsE, cpsD, cpsS, cpsU, cpsM, tnpA
and 'plpA, and with one or more genes in combination with
one or more flanking regions, are encompassed by the
invention.

Nucleic acid segments that include a first sequence
portion capable of hybridizing to the 5' cps gene
flanking region and a second sequence portion capable of
hybridizing with the 3' cps gene flanking region form one
aspect of the invention. Such nucleic acid segments may
be combined with one or more cps genes and may be
constructed to form a genetic unit in which the gene (or
genes) is located between the two flanking regions. The
gene(s) may be from a different cps serotype than the
flanking regions or from the same cps serotype as the
flanking regions. Such genetic units may be termed
cassettes and may also take the form of a circular DNA
segment, plasmid, cosmid, or phage. An isolated fragment
of DNA such as might be generated by restriction
digestion, ligation, and/or PCR methodologies would also
be included. The nucleic acid segment, or cassette, may
also include other regions of DNA, such as restriction or
cloning sites, PCR primers, promoters, antibiotic
resistance genes, and the like, as necessary or desired
to make a functional genetic unit.

In order to have utility in connection with the present
invention, all that is required is that such a nucleic
acid segment or genetic unit include a region of
sufficient complementarity and size to allow selective
cross-hybridization with the target flanking region or
gene sequence.

In general, shorter and intermediate length nucleic acid
fragments will be useful as hybridization probes and
primers, and in particular, for use in PCR, where the
primers will allow generation of the entire intervening
cps sequence.
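
The use of one primer in each flanking region to generate the
intervening sequence may be illustrated with a minimal in silico
sketch. It assumes perfect primer matches on a linear template;
the template, the primers and the function names are
hypothetical and chosen only for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon(template: str, fwd: str, rev: str):
    """Return the predicted product, or None if either primer fails to bind.

    fwd matches the top strand in the 5' flank; rev anneals to the top
    strand in the 3' flank, so its reverse complement is searched for.
    """
    start = template.find(fwd)
    if start < 0:
        return None
    site = template.find(revcomp(rev), start + len(fwd))
    if site < 0:
        return None
    return template[start:site + len(rev)]


# Hypothetical template: 5' flank, intervening region, 3' flank.
template = "TT" + "ACGGTCAT" + "GGGCCCAAAT" + "CTTGAGCA" + "GG"
fwd_primer = "ACGGTCAT"
rev_primer = revcomp("CTTGAGCA")

if __name__ == "__main__":
    print(amplicon(template, fwd_primer, rev_primer))
```

Because both primers lie in conserved flanking sequence, the
same primer pair would, under these assumptions, bracket
whatever type-specific sequence lies between them.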

Thus, flanking region and gene fragments on the order of
at least about 14-15, 20, 30, 40, 50, or 100 to 200
contiguous complementary nucleotides are contemplated,
although sequences of 500, 1,000, or more nucleotides in
length may also be used. The DNA segments may, of course,
be of any length within the stated ranges. This is the
meaning of "about" in this context, with "about" meaning a
range longer or shorter than the stated length, extending
to the previously quoted longer and shorter lengths (with
about 14 or so still being the minimum length). The
ranges thus encompass 1 to 4, 1 to 9, 1 to 49, 1 to 99,
and the like, nucleotides in length.
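
As a hedged illustration of what "contiguous complementary
nucleotides" means computationally, the sketch below finds the
longest uninterrupted stretch over which a probe is
complementary to a target strand and compares it with the
minimum of about 14 stated above. The sequences and function
names are hypothetical, and a perfect-match stretch is of course
only a rough proxy for actual hybridization behavior.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_complementary_run(probe: str, target: str) -> int:
    """Length of the longest contiguous stretch of probe complementary to target.

    Computed as the longest common substring between the reverse
    complement of the probe and the target strand.
    """
    rc = revcomp(probe)
    best = 0
    prev = [0] * (len(target) + 1)
    for a in rc:
        cur = [0] * (len(target) + 1)
        for j, b in enumerate(target, 1):
            if a == b:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


MIN_RUN = 14  # minimum contiguous length taken from the text above

# Hypothetical target strand and two candidate probes.
target = "AAAACCCGGGTTTACGTACGTAAAA"
good_probe = revcomp(target[4:20])             # 16 complementary bases
bad_probe = revcomp("CCCGGGTTGACGTACG")        # same region, one mismatch

if __name__ == "__main__":
    for probe in (good_probe, bad_probe):
        run = longest_complementary_run(probe, target)
        print(probe, run, run >= MIN_RUN)
```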

Longer nucleic acid segments and fragments having on the
order of up to 1,000, 2,000, 3,000, 5,000, 10,000, 15,000
or more nucleotides in length will also have particular
utilities in addition to their function in hybridization
embodiments. In particular, longer nucleic acid segments
and fragments including selected cps coding sequences may
be employed in the expression of recombinant proteins,
and nucleic acid segments that include one or more cps
gene sequences positioned between the flanking regions
(cassette constructs) may be used in gene transfer
embodiments as described above. The DNA segments may, of
course, be of any length within the ranges stated above,
so long as they function to achieve the desired effect.
"About" in this context therefore indicates ranges of
from 1 to 999, or 1 to 4,999, and the like, nucleotides
longer or shorter than the stated length.

Nucleic Acid and Amino Acid Sequences

Exemplary flanking region sequences, as disclosed herein,
are set forth in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3,
SEQ ID NO:4 and SEQ ID NO:6 (FIG. 7 and FIG. 8). The 5'
cps gene flanking region is represented by SEQ ID NO:1,
SEQ ID NO:2, SEQ ID NO:3 and SEQ ID NO:4. SEQ ID NO:1,
SEQ ID NO:2 and SEQ ID NO:3 correspond to regions of DNA
sequenced in the upstream portion of the 5' flanking
region, and SEQ ID NO:4 corresponds to the downstream
portion of the 5' flanking region and is termed the
"repeat" region. DNA between SEQ ID NO:1, SEQ ID NO:2,
SEQ ID NO:3 and SEQ ID NO:4 is also part of the cps 5'
flanking region. SEQ ID NO:1 is 300 nucleotides in length
and corresponds to the last 180 nucleotides of cpsB and
the 5' end of cpsC, which begins immediately thereafter
(FIG. 6A and FIG. 7). SEQ ID NO:2 is 261 nucleotides in
length and corresponds to a 3' end region of cpsC
(FIG. 6B and FIG. 7). SEQ ID NO:3 is 262 nucleotides in
length and corresponds to part of cpsE and the 5' end of
the repeat region (FIG. 6C and FIG. 7). SEQ ID NO:4 is
934 nucleotides in length (nucleotides 1 through 934,
FIG. 6D and FIG. 8). The 3' cps gene flanking region is
1307 nucleotides in length, 1 through 1307, SEQ ID NO:6
(nucleotides 5886 through 7192, FIG. 6I and FIG. 8).

The inventors have found that certain S. pneumoniae
nucleotide sequences described in the scientific
literature correspond to stretches of sequences from the
flanking regions of the present invention. For example,
Guidolin et al. (1994) have sequenced 6,322 base pairs of
the 19F S. pneumoniae serotype cps locus. Sequence
analysis indicated that this region contained six
complete open reading frames and one partial, which they
named cps19fA to cps19fG. Southern hybridization revealed
that cps19fA and cps19fB were conserved in 12 other
S. pneumoniae serotypes tested, including serotype 3.
cps19fC and cps19fD also hybridized to type 3
S. pneumoniae. The sequences for cpsB, cpsC and cpsE (SEQ
ID NO:1, SEQ ID NO:2 and SEQ ID NO:3), as disclosed
herein, are about 99% identical to cps19fB, cps19fC and
cps19fD respectively (Guidolin et al., 1994). However,
cpsE is truncated at the 3' end with respect to the 19F
gene (lacks ~280 nt). The site of the truncation is
adjacent to the region designated as the "repeat"
sequence (SEQ ID NO:4). Based on blotting data (see
Example 17), part of the repetitive element is in SEQ ID
NO:3. This sequence extends 132 nt upstream of the SacI
site at the start of SEQ ID NO:4, as shown in FIG. 7.
Although Guidolin et al. (1994) have sequenced this area
in serotype 19F, they do not identify this sequence as
being a gene flanking region and do not suggest its use
as part of an S. pneumoniae capsular cassette.

Garcia et al. (1993) localized a 781 bp EcoRV subfragment
of a gene (cap3-1) that they proposed was involved in the
transformation to a capsulated phenotype. The first 52
nucleotides of the 781 bp Garcia et al. sequence
correspond to nucleotides 883 to 934 of SEQ ID

residues 49 through 209 of SEQ ID NO:15. The inventors
have found that the 'plpA located adjacent to the type
3-specific genes lacks about one third of its 5' end when
compared to plpA genes located adjacent to the capsule
genes of other types, such as that used by Pearce et al.
(1994). Neither Pearce et al. (1994) nor Pearce et al.
(1993) identify this sequence as being involved in
capsule synthesis in any way, nor do they suggest that it
forms part of a common DNA flanking region.

Although certain stretches of nucleotide sequences may
have been known in the art, their function, their
relationship to capsule synthesis and, particularly,
their role as interchangeable flanking regions have not
previously been described. An important feature of the
invention is that the functional characterization of the
flanking regions allows, for the first time, for the
exchange of S. pneumoniae type-specific capsule genes to
be manipulated and controlled. This is only possible in
light of the inventors' discovery of the conserved cps
gene flanking regions. Nucleic acid segments, including
cassettes and plasmids, that include both 5' flanking
region sequences and 3' flanking region sequences are
thus one preferred embodiment of the invention. PCR
primers that have sequences corresponding to both
flanking regions form another preferred embodiment of the
invention.

Encoded within the upstream 5' flanking region (SEQ ID
NO:1, SEQ ID NO:2 and SEQ ID NO:3) are the partially
sequenced genes cpsB, cpsC and cpsE, which encode CpsB
(SEQ ID NO:7), CpsC (SEQ ID NO:8 and SEQ ID NO:9) and
CpsE (SEQ ID NO:10) (FIG. 6A, FIG. 6B, FIG. 6C and
FIG. 7).

The invention also includes other cps gene sequences,
either alone or in combination with the flanking region
sequences described above. The type-specific portions of
the polycistronic cps gene locus operon, as disclosed
herein, are encompassed within nucleotides 1 through
4951, SEQ ID NO:5 (nucleotides 935 through 5885, FIG. 6E,
FIG. 6F, FIG. 6G, FIG. 6H, FIG. 7 and FIG. 8). Besides
the open reading frames for the proteins, the sequences
also contain putative promoters that direct the
transcription of the genes of the cps locus. Other
promoters, herein termed "recombinant promoters", may
also be used to direct the expression of the cps genes in
accordance with the invention.



The genes encoded within SEQ ID NO:5, as disclosed
herein, include cpsD, cpsS, cpsU, cpsM and part of tnpA.

cpsD is 1277 nucleotides in length, 1 through 1277 of
SEQ ID NO:5 (935 through 2211, FIG. 6E and FIG. 8).

cpsS is 1267 nucleotides in length, 1277 through 2543 of
SEQ ID NO:5 (2211 through 3477, FIG. 6F and FIG. 8).

cpsU is 1055 nucleotides in length, 2707 through 3761 of
SEQ ID NO:5 (3641 through 4695, FIG. 6G and FIG. 8).

cpsM is 1194 nucleotides in length, 3758 through 4951 of
SEQ ID NO:5 (4692 through 5885, FIG. 6H and FIG. 8).



cpsS is just downstream of cpsD; only 15 nucleotides
separate a potential start codon for cpsS from the stop
codon of cpsD. Other start codons for cpsS are at
nucleotides 1311 and 1355 (SEQ ID NO:5). There is a large
non-coding region between cpsS and cpsU (nucleotides 2543
through 2707, SEQ ID NO:5), whereas cpsD and cpsS overlap
by one nucleotide and cpsU and cpsM overlap by 4
nucleotides. All four genes are transcribed in the same
direction and cpsD and cpsS are in the same reading
frame.
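
The gene coordinates given above can be cross-checked with a
short calculation. The sketch below uses only the SEQ ID NO:5
positions stated in the text and the fact that nucleotide 1 of
SEQ ID NO:5 corresponds to nucleotide 935 of FIG. 8 (an offset
of 934); the variable and function names are introduced here
for illustration only.

```python
# Gene positions within SEQ ID NO:5 (1-based, inclusive), from the text.
GENES = {
    "cpsD": (1, 1277),
    "cpsS": (1277, 2543),
    "cpsU": (2707, 3761),
    "cpsM": (3758, 4951),
}

# Nucleotide 1 of SEQ ID NO:5 is nucleotide 935 of FIG. 8.
FIG8_OFFSET = 934


def length(span: tuple) -> int:
    """Number of nucleotides in an inclusive (start, end) span."""
    return span[1] - span[0] + 1


def to_fig8(span: tuple) -> tuple:
    """Convert SEQ ID NO:5 coordinates to FIG. 8 numbering."""
    return (span[0] + FIG8_OFFSET, span[1] + FIG8_OFFSET)


def overlap(a: tuple, b: tuple) -> int:
    """Number of nucleotides shared by two inclusive spans (0 if none)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


if __name__ == "__main__":
    for name, span in GENES.items():
        print(name, length(span), to_fig8(span))
    print("cpsD/cpsS overlap:", overlap(GENES["cpsD"], GENES["cpsS"]))
    print("cpsU/cpsM overlap:", overlap(GENES["cpsU"], GENES["cpsM"]))
```

Run as written, the computed lengths, FIG. 8 positions and
overlaps agree with the values recited for cpsD, cpsS, cpsU and
cpsM.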

Encoded within the 3' flanking region (SEQ ID NO:6) is
the truncated sequence for plpA, designated 'plpA, which
is 823 nucleotides in length, 484 through 1307 of SEQ ID
NO:6 (nucleotides 6370 through 7192, FIG. 6I and FIG. 8).
As mentioned above, the 5' end is not present in the plpA
gene of type 3 S. pneumoniae. A partial transposase
sequence, tnpA, is contained between cpsM and 'plpA. It
is transcribed in the opposite orientation to all other
genes described herein, and extends from nucleotide 480
through 1 of SEQ ID NO:6 to overlap with the cpsM gene at
nucleotides 4951 through 4902 of SEQ ID NO:5, a total of
531 nucleotides (nucleotides 5836 through 6366, FIG. 6J
and FIG. 8).

cpsD encodes a 394 amino acid sequence (SEQ ID NO:11)
which is homologous to that of the UDP-glucose
dehydrogenase (HasB) from Streptococcus pyogenes. The
deduced amino acid sequence encoded by cpsS predicts a
protein of 416 residues (SEQ ID NO:12) which is
homologous to polysaccharide synthases. cpsU encodes a
306 amino acid sequence (SEQ ID NO:13) which is
homologous to glucose-1-phosphate uridylyltransferases
from several other bacterial species. cpsM encodes a 397
amino acid sequence (SEQ ID NO:14) which has homology
with phosphoglucomutases from several bacterial species.
However, it lacks approximately 25% of the C-terminal
region present in other phosphomutases and may not encode
a functional protein. 'plpA encodes a 274 amino acid
sequence (SEQ ID NO:15) and tnpA encodes a 177 amino acid
sequence (SEQ ID NO:16).



Thus, in certain particular embodiments the present
invention concerns the individual cps genes, including
segments encoding sequences corresponding to one or more
of the cpsB, cpsC, cpsE, cpsD, cpsS, cpsU and cpsM genes.
tnpA and plpA gene constructs, when combined with other
cps genes and flanking regions, are also encompassed by
the invention. In further embodiments, the invention
concerns the respective proteins and polypeptides encoded
by the cpsB, cpsC, cpsE, cpsD, cpsS, cpsU, cpsM, tnpA and
plpA genes. The proteins, polypeptides and peptides of
the invention may be used in a variety of embodiments,
including, e.g., in immunological protocols to generate
antibodies that may, in turn, be used in diagnostic
embodiments to detect S. pneumoniae.

It should be noted that in the definition of the genes
and proteins, the term "cps" is not used to indicate that
the gene or protein concerned has a defined role in
capsule synthesis in all cases. Rather, it indicates that
the gene is located between the cps gene flanking
regions, i.e., within the cps operon, and in close
association with other cps genes. It should also be noted
that the "S. pneumoniae gene region" refers to all
genetic elements associated with the cps genes, including
genes incorporated within the flanking regions. A
"genetic element" refers to any DNA that may encode a
protein or polypeptide, regardless of functionality. The
utility of the cps genes remains that, e.g., they may be
used in the same diagnostic manner to identify
S. pneumoniae.

While the present disclosure is exemplified in part
through the cloning and sequencing of type 3 cps genes,
the techniques are equally applicable to the cps genes of
other capsule serotypes, including any one of the 85
serotypes. For example, using the techniques developed
for the characterization of the type 3 cps gene region,
the inventors have proceeded to characterize the
restriction maps for the type 2 and type 6B strains
(Example 16, FIG. 11). As expected, the maps of the
non-type specific flanking regions were found to be
identical from serotype to serotype, whereas the maps for
the cps gene regions themselves were serotype specific.

Diagnostic Embodiments 

The cps flanking region and gene sequences, and the
encoded proteins, may be employed in diagnostic
embodiments. For example, the amount of S. pneumoniae,
or S. pneumoniae serotype, present within a biological
sample, such as blood, serum or a swab from nose, ear or
throat, may be determined by means of a molecular
biological assay to determine the level of nucleic acid
complementary to the cps loci, or even by means of an
immunoassay to determine the level of one of the
polypeptides encoded by a gene from this locus. The cps
locus DNA segment used in molecular biological assays may
include the non-type specific segments such as the 5' and
3' flanking regions, SEQ ID NO:1, SEQ ID NO:2, SEQ ID
NO:3, SEQ ID NO:4, SEQ ID NO:6 and any sequence in
between, or the region encoding various polypeptides,
such as those incorporated within SEQ ID NO:1, SEQ ID
NO:2, SEQ ID NO:3, SEQ ID NO:5 and SEQ ID NO:6.

In a molecular biological method for detecting
S. pneumoniae, one would obtain nucleic acids from a
suitable sample and analyze the nucleic acids to identify
a specific nucleic acid segment complementary to the cps
loci (whether type- or non-type-specific). The nucleic
acid segment will generally be identified by sequence,
which method generally includes either: identifying a
transcript with a corresponding or complementary
sequence, e.g., by Northern or Southern blotting using an
appropriate probe; or identifying a transcript with two
or more shorter primers and amplifying with PCR
technology.

Blotting Techniques 

To detect S. pneumoniae, as may be used to identify a
patient with otitis media, pneumococcal pneumonia or
pneumococcal meningitis, using a method based upon
hybridization and blotting techniques, one may use a
probe with a sequence as set forth in SEQ ID NO:1, SEQ ID
NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5 or SEQ ID
NO:6, including any sequence in between, or an equivalent
thereof. This imparts an evident utility to the nucleic
acid segments of the present invention.

To conduct such a diagnostic method, one would
generally obtain sample nucleic acids from the sample and
contact the sample nucleic acids with a nucleic acid
segment corresponding to the cps loci disclosed herein,
under conditions effective to allow hybridization of
substantially complementary nucleic acids, and then
detect the presence of any hybridized substantially
complementary nucleic acid complexes that formed.

The presence of a substantially complementary
nucleic acid sequence in a sample, or a significantly
increased level of such a sequence, in comparison to the
levels in a normal or "control" sample, will thus be
indicative of a sample that harbors S. pneumoniae. Here,
substantially complementary nucleic acid sequences are
those that have relatively little sequence divergence and
that are capable of hybridizing to the sequences
disclosed herein under standard conditions.

Where a substantially complementary nucleic acid
sequence, or a significantly increased level thereof, is
detected in a clinical sample from a patient suspected of
having otitis media, pneumococcal pneumonia or
pneumococcal meningitis, for example, this will be
indicative of a patient that does have that particular
disease. As used herein, the term "increased levels" is
used to describe a significant increase in the amount of
the cps loci nucleic acids detected in a given sample in
comparison to that observed in a control sample, e.g., an
equivalent sample from a normal healthy subject.

A variety of hybridization techniques and systems
are known that can be used in connection with the S.
pneumoniae detection aspects of the invention, including
diagnostic assays such as those described in Falkow
et al., U.S. Patent 4,358,535.

In general, the "detection" of a cps locus is
accomplished by attaching or incorporating a detectable
label into the nucleic acid segment used as a probe and
"contacting" a sample with the labeled probe. In such
processes, an effective amount of a nucleic acid segment
that comprises a detectable label (a probe) is brought
into direct juxtaposition with a composition containing
target nucleic acids. Hybridized nucleic acid complexes
may then be identified by detecting the presence of the
label, for example, by detecting a radioactive, enzymatic,
fluorescent, or even chemiluminescent label.

Where one simply desires to distinguish S.
pneumoniae DNA from the DNA of other bacteria, it is
contemplated that the non-type specific region sequences
may be employed as probes. However, where one desires
to distinguish among different S. pneumoniae serotypes,
it is contemplated that probes will include both type
specific and non-type specific cps sequences. The
type-specific sequences will selectively hybridize only
to corresponding serotypes. Thus one can envision that a
battery of serotype specific cps nucleic acid
hybridization probes can be employed to distinguish and
identify serotype DNA samples. In these instances, it is
not believed to be necessary to employ restriction enzyme
digestion prior to hybridization, but this can be
employed where desired. Alternatively, a single non-type
specific sequence may be employed as a "universal probe"
that allows detection of restriction fragment length
polymorphisms (RFLPs). Typically for RFLP detection, one
will employ the more specific hybridization technique of
Southern analysis wherein restriction digestion of the
unknown or target DNA is carried out using an enzyme that
will cleave either within or surrounding the cps gene
region (FIG. 4 and FIG. 11 show restriction maps for
several of the serotype cps gene regions).
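
The RFLP principle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two target sequences are invented toy strings (not sequences from the cps locus), the helper name `fragment_lengths` is hypothetical, and a real Southern analysis would reveal only those fragments that hybridize to the probe.

```python
def fragment_lengths(seq, site, cut_offset):
    """Fragment lengths from cutting a linear sequence at every site.

    cut_offset is the number of bases into the recognition site at
    which the enzyme cuts the given strand.
    """
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# HindIII recognizes AAGCTT and cuts after the first A.
HINDIII_SITE, HINDIII_OFFSET = "AAGCTT", 1

# Invented toy targets (assumption): the second has lost one HindIII site.
SAMPLE_1 = "GGGG" + "AAGCTT" + "CCCCCCCC" + "AAGCTT" + "TT"
SAMPLE_2 = "GGGG" + "AAGCTT" + "CCCCCCCC" + "AAGCTA" + "TT"

if __name__ == "__main__":
    print(fragment_lengths(SAMPLE_1, HINDIII_SITE, HINDIII_OFFSET))
    print(fragment_lengths(SAMPLE_2, HINDIII_SITE, HINDIII_OFFSET))
```

A single base change that removes a site merges two fragments into one longer fragment, which is the length polymorphism a universal probe would detect.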

Many suitable variations of hybridization technology
are available for use in the detection of nucleic acids,
as will be known to those of skill in the art. These
include, for example, in situ hybridization, Southern
blotting and Northern blotting. In situ hybridization
describes the techniques wherein the target nucleic acids
contacted with the probe sequences are those located
within one or more cells, such as cells within a clinical
sample or even cells grown in tissue culture. As is well
known in the art, the cells are prepared for
hybridization by fixation, e.g., chemical fixation, and
placed in conditions that allow for the hybridization of
a detectable probe with nucleic acids located within the
fixed cell.

Alternatively, target nucleic acids may be separated
from a cell or clinical sample prior to contact with a
probe. Any of the wide variety of methods for isolating
target nucleic acids may be employed, such as cesium
chloride gradient centrifugation, chromatography (e.g.,
ion, affinity, magnetic), phenol extraction and the like.
Most often, the isolated nucleic acids will be separated,
e.g., by size, using electrophoretic separation, followed
by immobilization onto a solid matrix, prior to contact
with the labeled probe. These prior separation
techniques are frequently employed in the art and are
generally encompassed by the terms "Southern blotting"
and "Northern blotting". Although the execution of
various techniques using labeled probes to detect cps
locus DNA or RNA sequences in clinical samples is well
known to those of skill in the art, a particularly
preferred method is described in detail herein, in
Example 4.

PCR Techniques 

To detect S. pneumoniae using a method based upon
PCR technology of U.S. Patent 4,603,102 (incorporated
herein by reference), one may also use one or more probes
with a sequence as set forth in SEQ ID NO:1, SEQ ID NO:2,
SEQ ID NO:3 and SEQ ID NO:4 (including sequence in
between), SEQ ID NO:5 or SEQ ID NO:6, or an equivalent
thereof.

To conduct such a diagnostic method, one would
generally obtain sample nucleic acids from a suitable
source and contact the sample nucleic acids with two
probes or primers corresponding to the cps loci disclosed
herein, under conditions which allow for hybridization
and polymerization to occur. A pair of probes, one
corresponding to the 5' flanking region and the other
corresponding to the 3' flanking region, would be
sufficient to detect the presence of S. pneumoniae in a
sample and may even be used to indicate the amount of
bacteria present. Furthermore, the size of the isolated
PCR product, when separated by any of the methods
described above, may be sufficient to identify the S.
pneumoniae serotype. Alternatively, the PCR product may
be digested with one or more restriction enzymes, which
would enable individual serotypes to be distinguished,
using the same principle as RFLP.
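
The size-based readout described above can be illustrated with a minimal in silico sketch. It assumes invented placeholder sequences (the flank and cassette strings below are not SEQ ID NO:1 through SEQ ID NO:6), exact primer matching, and a hypothetical helper named `amplicon_length`; primer design and annealing behaviour are not modelled.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def amplicon_length(template, forward_primer, reverse_primer):
    """Length of the product bounded by two exactly matching primers.

    Returns None when either primer site is absent from the template.
    """
    template = template.upper()
    start = template.find(forward_primer.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so its site on
    # the given strand reads as the primer's reverse complement.
    site = reverse_complement(reverse_primer)
    end = template.find(site, start + len(forward_primer))
    if end == -1:
        return None
    return end + len(site) - start


# Invented placeholder flanks shared by both toy serotypes (assumption).
LEFT_FLANK = "ATGGCTAGCTTAGGC"
RIGHT_FLANK = "GGATCCTTAGCAGTC"
FORWARD = LEFT_FLANK[:9]
REVERSE = reverse_complement(RIGHT_FLANK[-9:])

# Two hypothetical serotypes that differ only in cassette length.
TYPE_A = LEFT_FLANK + "ACGT" * 5 + RIGHT_FLANK
TYPE_B = LEFT_FLANK + "ACGT" * 12 + RIGHT_FLANK

if __name__ == "__main__":
    for name, template in (("type A", TYPE_A), ("type B", TYPE_B)):
        print(name, amplicon_length(template, FORWARD, REVERSE))
```

Because both primers sit in the common flanking DNA, the same primer pair yields a product from either toy serotype, and the product length differs by exactly the difference in cassette length.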

In further embodiments, it may be desired to employ
other probes corresponding to type specific or non-type
specific regions. A battery of serotype specific probes
(probes corresponding to type specific DNA regions) may be
employed in individual reactions with a universal probe
(a probe corresponding to non-type specific regions). The
size of PCR products, with or without prior digestion
with restriction enzymes, would distinguish and identify
the S. pneumoniae serotype.

Kits 

Kits for use in Southern and Northern blotting or
PCR technology, to identify S. pneumoniae and/or
individuals having, or at risk for developing, otitis
media, pneumococcal pneumonia or pneumococcal
meningitis, are also contemplated to fall within the
scope of the present invention. Such kits will comprise,
in suitable container means, cps nucleic acid probes;
also, generally, unrelated probes for use as controls;
and optionally, one or more restriction enzymes.

Characterization of Streptococcus pneumoniae Serotypes

In another embodiment of the invention, the non-type
specific DNA may be used to isolate and characterize the
type-specific DNA sequence for all or any strain of S.
pneumoniae. Consequently, the most suitable probes for
diagnosing S. pneumoniae infection, for use in any
molecular biological technique, could be found.

The present invention identifies the common flanking
DNA which may be used in hybridizations to target the
location of the type-specific cps genes for any strain of
S. pneumoniae. Once this location has been identified,
the cps genes may then be isolated and characterized by
the use of conventional techniques, such as chromosome
crawling, PCR technology, cloning and DNA sequencing, all
of which are disclosed herein. As mentioned above, this in
turn would enable suitable probes from each S. pneumoniae
strain to be chosen and then used for diagnostic and
research purposes. Furthermore, the genetic elements
involved in determining S. pneumoniae serotype may be
elucidated, and an understanding of their effect on
virulence and evolutionary role may be achieved.

In still more particular embodiments, the invention
concerns the preparation and cloning of entire cps gene
regions encoding one or more specific cps genes of a
particular serotype, positioned within a "cassette" for
ease of manipulation, e.g., in plasmid preparation or host
cell introduction, etc. Thus, cps gene cassettes in
accordance with the present invention will typically
include left and right hand flanking regions to allow
homologous recombination in S. pneumoniae host cells.

A preferred method for preparing a cps gene cassette
is through the application of PCR technology wherein left
and right hand primers corresponding to the left and right
hand flanking regions are employed to amplify the cps gene
coding regions. Of course, the primers employed will
typically include at their termini appropriate
restriction enzyme sites. Thus, the resulting cassette
will preferably include at each terminus a restriction
enzyme site of choice. The site, of course, will depend
upon the vector that is ultimately employed for
manipulation or transformation but may be the specific
cleavage site for one or more of the restriction enzymes
shown in FIG. 4 and FIG. 7 or listed in Table 1. The
list is, of course, exemplary, and any restriction enzyme
could be employed, as is well known to those of skill in
the art.
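
As a rough illustration of primers that carry restriction sites at their termini, the following Python sketch builds a forward and a reverse primer from two flanking sequences. The flank strings are invented placeholders, the choice of SphI and SacI sites is arbitrary, and `tailed_primers` is a hypothetical helper name; none of this is taken from the sequences disclosed herein.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def tailed_primers(left_flank, right_flank, left_site, right_site, length=18):
    """Build forward and reverse primers carrying 5' restriction sites.

    The forward primer copies the start of the left flank; the reverse
    primer is the reverse complement of the end of the right flank.
    """
    forward = left_site + left_flank[:length].upper()
    reverse = right_site + reverse_complement(right_flank[-length:])
    return forward, reverse


SPHI = "GCATGC"  # SphI recognition sequence
SACI = "GAGCTC"  # SacI recognition sequence

# Invented placeholder flanks (assumption for illustration only).
LEFT_FLANK = "ATGGCTAGCTTAGGCATCGA"
RIGHT_FLANK = "CCGATTACGGATCCTTAGCA"

FORWARD, REVERSE = tailed_primers(LEFT_FLANK, RIGHT_FLANK, SPHI, SACI, length=10)

if __name__ == "__main__":
    print("forward:", FORWARD)
    print("reverse:", REVERSE)
```

Amplification with such primers would place one site at each end of the product, so the cassette could be cut out and ligated into a vector bearing compatible sites.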

The present invention also contemplates more
traditional approaches to cps gene region cloning, such
as by partial fragmentation of S. pneumoniae DNA followed
by cloning into a recombinant cloning host, such as E.
coli, and screening by hybridization and antibiotic
selection (using a selection marker found on the plasmid
or other vector employed for cloning). Of course, if the
cloning host is not a S. pneumoniae host, one may employ
either type specific or non-type specific cps gene
sequences for screening. In these cases, cassettes that
are developed may include enzyme restriction sites
naturally found to occur in the flanking regions, such as
an SphI site.

It is contemplated that virtually any type of host
cell may be employed in connection with the present
invention, depending on the particular application. For
example, where one simply desires to manipulate cps gene
sequences, any acceptable host may be employed, such as
an E. coli, or even an appropriate eukaryotic host where
desired. However, where one contemplates producing
capsule polysaccharides, one will desire to employ a gram
positive host such as a bacillus, staphylococcus or
streptococcus host. Particularly preferred will be an
S. pneumoniae host, in that it is contemplated that such
hosts will be more readily amenable to manipulation to
increase capsular polysaccharide production.

The inventors contemplate that a particular
application, therefore, will be the use of recombinant
hosts not only for the preparation and manipulation of
cps gene sequences, but also in the large scale
production of capsule saccharides and polysaccharides.
These polysaccharides are useful for the production of
antigenic haptens and epitopes for use in the production
of immunogens. Haptens and epitopes are described herein
as portions of a molecule against which an immune
response is directed. In conjunction with a molecule
that elicits an immune response, that is, an immunogen, a
hapten-immunogen complex is able to elicit an immune
response.

Generally speaking, where capsule production is
required, one will employ a S. pneumoniae host cell into
which a selected cps gene region is introduced or
"cassetted in". First, a DNA segment that includes the
selected S. pneumoniae cps gene(s) flanked by sufficient
S. pneumoniae flanking regions to allow homologous
recombination in the S. pneumoniae host is identified. It
is contemplated that flanking regions on the order of 0.1
to 1 kb will be sufficient to allow recombination to
occur. Once an appropriate DNA segment is introduced
into the S. pneumoniae host, either as genomic DNA or as
a recombinant vector (plasmid), transformed hosts
expressing the introduced cps gene are selected.
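
The flank-length consideration above lends itself to a simple bookkeeping check. The sketch below assumes that the lower end of the stated range (0.1 kb, i.e. 100 bp) is used as a minimum on each side; the function names and the example coordinates are hypothetical and serve only to illustrate the arithmetic.

```python
MIN_FLANK_BP = 100  # lower end of the 0.1 to 1 kb range given in the text


def flank_lengths(segment_length, cassette_start, cassette_end):
    """Return (left, right) flank lengths in bp for a linear DNA segment.

    cassette_start and cassette_end are 0-based, end-exclusive
    coordinates of the type-specific region within the segment.
    """
    if not 0 <= cassette_start <= cassette_end <= segment_length:
        raise ValueError("cassette coordinates fall outside the segment")
    return cassette_start, segment_length - cassette_end


def flanks_sufficient(segment_length, cassette_start, cassette_end,
                      minimum=MIN_FLANK_BP):
    """True when both flanks meet the assumed minimum length."""
    left, right = flank_lengths(segment_length, cassette_start, cassette_end)
    return left >= minimum and right >= minimum


if __name__ == "__main__":
    # Hypothetical 6 kb segment with the cassette at positions 500..5200.
    print(flank_lengths(6000, 500, 5200))
    print(flanks_sufficient(6000, 500, 5200))
```

A segment whose cassette starts only 40 bp from one end would fail this check, signalling that a longer flank should be included on that side before transformation.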

DNA may be introduced into a suitable host by a
variety of mechanisms, including natural transformation
of S. pneumoniae, calcium mediated transformation or
electroporation of E. coli. A particularly preferred
method of bacterial transformation includes the step of
making an S. pneumoniae competent for transformation by
growth in Todd Hewitt broth supplemented with calcium and
bovine serum albumin. Alternatively, electroporation may
be employed: one makes the bacterial cells, such as the
E. coli strain LE392, competent in 10% glycerol in water,
adds the DNA, and electroporates the cells.

Where the cps gene DNA segment is introduced in the
form of a plasmid, it would be preferable, at certain
times, to employ a plasmid that is free of a S.
pneumoniae origin of replication. Thus, virtually any
traditional plasmid (which are designed, e.g., for gram
negative hosts such as E. coli) may be employed. The
reason is that where the plasmid is free of a S.
pneumoniae origin of replication, only those clones that
have successfully undergone homologous recombination with
the recombinant cps gene region will be detected. Stated
another way, in this case, there is no requirement of a
S. pneumoniae origin of replication in order for
homologous recombination to occur, and thus homologous
recombinants are inherently selected for using such a
cloning technique.

Alternatively, it may be simpler to introduce the
cassette into S. pneumoniae on a replicating plasmid with
a S. pneumoniae origin of replication. In this way,
higher levels of polysaccharide production would be
achieved as a result of the elevated copy number of the
plasmid (10 to 20) as opposed to the low copy number of
the chromosome (1 to 2), and homologous recombination
need not occur.

Additionally, it is contemplated that there will be
some advantage to employing as the starting host a S.
pneumoniae strain that is a high producer of its own
inherent cps gene. These high producers will necessarily
include the genetic environment to support high
production of the newly introduced cps complex, and thus
will likely be ideal hosts for such production. To
achieve such recombinants, all that is required is that
the heterologous gene containing the flanking regions, or
a genome containing the flanking regions, is introduced
into the host cell, and resultant recombinants wherein
the homologous gene has been replaced are selected.

Although the invention contemplates in particular
embodiments the introduction of an entire recombinant cps
gene region where capsule saccharide synthesis is
desired, the invention also contemplates the introduction
of one, two, three or so of the individual cps genes. It
is contemplated that one or two genes, such as those that
control the biosynthesis of sugar precursors, may be
sufficient to, in and of themselves, confer serotype
specific saccharide production on the selected host, or
can, in and of themselves, upregulate capsule production by
existing cps genes of the host.


BRIEF DESCRIPTION OF THE DRAWINGS 



The following drawings form part of the present
specification and are included to further demonstrate
certain aspects of the present invention. The invention
may be better understood by reference to one or more of
these drawings in combination with the detailed
description of specific embodiments presented herein.

FIG. 1. ELISA for the detection of type 3 capsule.
Wells of microtiter dishes were coated with crude
extracts of capsule material. Type 3 capsule was
detected with the monoclonal antibody 16.3 (Briles et
al., 1981a). The type 2 strain D39 served as a negative
control. Measurements were made in triplicate, and error
bars represent standard errors. Values were standardized
to protein content. Genotypic designations were based on
linkage as determined by transformation mapping.

FIG. 2. Insertion-duplication restoration. The
cloned fragment and the homologous fragment in the
chromosome are represented by the open boxed regions.
The dark block represents the mutation in the chromosome
of the mutant strain. Insertion of the plasmid clone
into the pneumococcal chromosome results in a duplication
of the homologous fragment with the plasmid inserted in
between. If the recombination occurs to the left of the
mutation as shown here, a wild type, full-length copy of
the gene is created and function is restored. The
plasmid clone leading to restoration spontaneously
excises at low frequency through homologous recombination
and can therefore be easily recovered by transformation
to E. coli. pJD330 contains a 2.4 kb Sau3AI fragment.

FIG. 3A. Repair of capsule mutations by double
crossover recombination event. Mutants were transformed
with plasmid subclones of pJD330 and no selection for
Ery^ was made. The box in the chromosome represents the
region in the mutant chromosome homologous to the
fragment cloned in pJD330. The plasmid represents a
subclone of pJD330 capable of restoring encapsulation in
the mutant strain.

FIG. 3B. Deletion analysis to locate the site of
the cpsA1, capD4, and Rx1 mutations. The mutations were
mapped by transformation with plasmid clones containing
the indicated fragments and screening for the mucoid
colony phenotype. Identical fragments repaired the
cpsA1, capD4, and Rx1 mutations. No selection was made
for insertion of the plasmids, thus these numbers
represent double crossover events. The actual
frequencies of repair, shown for the cpsA1-containing
mutant, are mainly a reflection of the transformation
efficiency of the recipient. No encapsulated
transformants were obtained when pJY4163 (no insert) was
used for transformation. Restriction sites are: M, MunI;
P, PstI; Pv, PvuII; R, RsaI; S3, Sau3AI; Ss, SspI; Xb,
XbaI.

FIG. 4. Physical and genetic map of the type 3
capsule region of S. pneumoniae WU2. The restriction map
was developed by probing chromosomal digests of WU2 with
pJD330 and pJD366 and by probing chromosomal digests of
JD770 with pJD330 and pJY4163. The locations of primary
clones pJD330 and pJD366 are indicated above the map.
Subcloned fragments used to target insertion-duplication
mutations are indicated below the map. Insertion of
plasmids containing shaded fragments led to loss of
capsule production. + or - at the end of a fragment
indicates the presence or absence of transcription
detected at that point in the chromosome. Insertions of
pJD330 and pJD366 are in the orientation to detect
transcription to the left. Clones pJD351 and pJD364,
which contain the pJD330 and pJD366 fragments,
respectively, in the opposite orientation, were also used
for the transcription studies. The other plasmids used
for insertions were pJD356, pJD337, pJD369, pJD359,
pJD362, pJD361, pJD357, and pJD374, in the order shown.
The genes indicated by genetic data or suggested by
transcription determinations were drawn based on
preliminary sequence information. The cps3DSUM
designations are based on probable functions, as
described in the text. Restriction sites are Bg, BglII;
Ev, EcoRV; H, HindIII; Ha, HaeIII; M, MunI; P, PstI; Pv,
PvuII; R, RsaI; S, SacI; S3, Sau3AI; Sa, SalI; Sp, SphI;
X, XbaI. Restriction sites are not necessarily unique to
the entire region.

FIG. 5. Schematic representation of the capsule
regions in these strains. Insertions in JD871 and JD872
result from incorporation of pJD366. The insertion in
JD803 is pJD330. The shaded square symbol represents type
2 specific DNA; the open square symbol represents type 3
specific DNA; the hatched square symbol represents
flanking DNA common to both type 2 and type 3; and the
black square symbol represents pJY4163 or pJY4164. The
locations of the probes used are indicated below the map.

FIG. 6A. DNA sequence of the 5' flanking region
including partial sequence of cps3B and cps3C (SEQ ID
NO:1) encoding for SEQ ID NO:7 and SEQ ID NO:8.

FIG. 6B. DNA sequence of the 5' flanking region
including partial sequence of cps3C (SEQ ID NO:2),
encoding for SEQ ID NO:9.

FIG. 6C. DNA sequence of the 5' flanking region
including partial sequence of cps3E (SEQ ID NO:3),
encoding for SEQ ID NO:10.

FIG. 6D. DNA sequence of the "repeat" upstream
flanking DNA (SEQ ID NO:4).

FIG. 6E. DNA sequence of the region containing
cps3D (nucleotides 1 through 1277, SEQ ID NO:5). The -35
and -10 hexamers of potential σ-70 type promoters
upstream of cps3D are underlined and labeled above the
sequence. Putative ribosome binding sites are
underlined. The precise locations of endpoints of
insertion mutations shown in FIG. 7 are indicated by
triangles below the sequence and are labeled with the
name of the strain containing the given mutation. The
locations of spontaneous mutations in cps3D are labeled
with the sequence of the mutation and the name of the
strain containing the mutation. The sequence from the
PvuII-SspI fragment of A66R2 began at nucleotide 1921,
thus it is possible that additional mutations are present
from the PvuII site to this point. Selected restriction
sites are shown.

FIG. 6F. DNA sequence of the region containing
cps3S (nucleotides 1277 through 2543, SEQ ID NO:5). The
precise locations of endpoints of insertion mutations
shown in FIG. 7 are indicated by triangles below the
sequence and are labeled with the name of the strain
containing the given mutation.



FIG. 6G. DNA sequence of the region containing
cps3U (nucleotides 2707 through 3806, SEQ ID NO:5). The
-35 and -10 hexamers of potential σ-70 type promoters
upstream of cps3U are underlined and labeled above the
sequence. A short region of dyad symmetry just upstream
of cps3U is overlined. Putative ribosome binding sites
are underlined. The precise locations of endpoints of
insertion mutations shown in FIG. 7 are indicated by
triangles below the sequence and are labeled with the
name of the strain containing the given mutation.
Selected restriction sites are shown.

FIG. 6H. DNA sequence of the region containing
cps3M (3746 through 4951, SEQ ID NO:5) with corresponding
amino acid sequences. Note that the first line of FIG.
6H overlaps the last line of FIG. 6G.

FIG. 6I. DNA sequence of the region containing the
3' flanking region including 'plpA (SEQ ID NO:6) with
corresponding amino acid sequences (SEQ ID NO:15).

FIG. 6J. The DNA sequence for a partial transposase
A, tnpA, located between 'plpA and overlapping cps3M. The
open reading frame is in the opposite orientation
starting at nucleotide 6366 through 5836 of FIG. 6I.

FIG. 7. Map of the chromosomal region involved in
the biosynthesis of S. pneumoniae type 3 capsular
polysaccharide. Triangles indicate the endpoints of
insertion mutations. Filled triangles represent
insertions which resulted in loss of capsule production.
Open triangles represent mutations which did not result
in loss of capsule production. Restriction enzyme sites
are: Bg, BglII; Ev, EcoRV; H, HindIII; P, PstI; Pv,
PvuII; S, SacI; Sa, SalI; Sp, SphI. Also included is a
diagram showing the position of SEQ ID NO:1, SEQ ID NO:2
and SEQ ID NO:3, including the corresponding amino acid
sequences SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9 and SEQ
ID NO:10.

FIG. 8. Map of the chromosomal region involved in
the biosynthesis of S. pneumoniae type 3 capsular
polysaccharide showing the relationship between SEQ ID
NO:4, SEQ ID NO:5 and SEQ ID NO:6, and DNA sequence as
described in FIG. 6D through FIG. 6J.

FIG. 9. Location of insertion mutations in the
type 3-specific region of the S. pneumoniae WU2
chromosome. Schematic illustration of the insertions.
The schematic was derived from Southern blot analysis
such as that shown in FIG. 11a and FIG. 11b. The ability
of the strains to produce type 3 capsule is indicated.
Restriction sites are: F, FspI; H, HindIII; K, KpnI; Ms,
MscI; P, PstI; Pv, PvuII; X, XbaI.

FIG. 10. Biosynthetic pathway for type 3 capsular
polysaccharide. The functions of the proteins encoded by
the type 3-specific genes, based on homologies, genetic,
and biochemical data are shown. Additional functions may
be necessary for capsule transport or attachment.

FIG. 11. Chromosome maps of the capsule regions in
strains of types 2, 3, and 6B. The SacI-HindIII fragment
(pJD377) from type 3 used for the probe is shown below
the maps. Restriction sites are Bg, BglII; F, FspI; H,
HindIII; S, SalI; Sac, SacI; Sp, SphI.

FIG. 12. Production of type 3 capsule. Buoyant
densities of parents and derivatives expressing the type
3 capsule. Densities were determined by centrifugation
on 0 to 50% Percoll gradients for 30 min at 8,000 x g.
Samples were grown in duplicate, and the density of each
sample was determined in duplicate gradients. The
results shown were obtained with bacteria grown on solid
medium. Identical results were obtained from growth in
liquid culture.

FIG. 13. Total capsule production. Triplicate
cultures of each strain were grown to an OD600 of ~0.5.
Duplicate wells of polyvinyl plates were coated with
either supernatant fluids or cell sonicates. Total
capsule contents of the type 3 parent and the derivatives
were determined by using a monoclonal antibody to type 3
capsule. See Table 2, footnote d, for explanation of
strain designations.

FIG. 14A. All studies were done in BALB/ByJ mice.
See Table 7, footnote d, for explanation of strain
designations. Virulence of type 2 derivatives. The
parental strains JD770 (3/3) and D39 (2/2) and the
derivatives JD803 (2/3) and JD804 (2/3) had similar LD50s
(50 to 75 CFU). For time-to-death studies, groups of
mice were infected i.p. with doses approximately 5- to
20-fold above the LD50. The observed times to death were
not related to the dose received. Each circle represents
an individual mouse. The median times to death for D39
(2/2) and for derivatives JD803 (2/3) and JD804 (2/3)
were 31.5 and 33.0 h, respectively (not significantly
different). All three values differed significantly (P <
0.005) from that of the type 3 parent JD770 (52 h). The
values for JD803 and JD804 did not differ from each
other.

FIG. 14B. All studies were done in BALB/ByJ mice.
See Table 7, footnote d, for explanation of strain
designations. Virulence of the type 5 derivatives. Mice
were infected i.v. with doses of 10^ to 10^ CFU (10-fold
increments) of each type 3 derivative. Doses of 10^ to
10^ and 10^ to 10^ CFU were used for the parental strains
DBL5 (5/5) and JD770 (3/3), respectively. The total
number of mice used per dose is listed beside each datum
point. TK5010* represents the combined data for TK5010
(5/3), TK5011 (5/3), and TK5012 (5/3). The derivatives
did not differ from each other but did differ
significantly from the parental strains JD770 (P <
0.0005) and DBL5 (P < 0.0001).



FIG. 14C. All studies were done in BALB/ByJ mice.
See Table 7, footnote d, for explanation of strain
designations. Virulence of type 6B derivatives. Mice
were infected i.p. with doses of 10^ to 10^ CFU of the
type 6B derivatives. Doses of 10^ to 10^ CFU and of 10^
to 10^ CFU were examined for the parent strains JD770
(3/3) and DBL1 (6B/6B), respectively. The total number
of mice used per dose is listed beside each datum point.
TK3028* represents the combined data for TK3026 (6B/3)
and TK3028 (6B/3), which did not differ from each other.
However, these strains did differ significantly from the
parental strains JD770 (3/3) (P < 0.003) and DBL1 (6B/6B)
(P < 0.0005).

FIG. 15A. Model for the transfer of type-specific
genes. Cassette-type recombination. Replacement of the
recipient's type-specific genes with those of the donor
results from homologous recombination between common
regions that flank the type-specific cassettes. The open
ellipsoid symbol represents sequence containing repeated
element; the black oblong symbol represents common DNA
upstream of type-specific cassettes; and the open oblong
symbol represents common DNA (including plpA) downstream
of type-specific cassettes.

FIG. 15B. Model for the transfer of type-specific
genes. Binary encapsulation by recombination involving
homology at only one end. Integration at one end of the
type-specific cassette would occur via homologous
recombination through the repeated element. Integration
at the other end would result from an apparent
illegitimate recombination. Linkage of the two
type-specific cassettes would result if the integration
occurred in a repeat element in or closely linked to the
recipient's capsule genes. Symbols represent the same as
in FIG. 15A.

FIG. 15C. Model for the transfer of type-specific
genes. Binary encapsulation via a transposition-like
event. Type-specific cassettes flanked by the repeated
element would resolve out of the chromosome and be
transferred to recipient cells as circular intermediates.
Recombination into the recipient chromosome could occur
at a repeat element unlinked (as shown) or linked to the
recipient's type-specific genes. Transfer of linear DNA
could also yield binary strains as a result of
recombination between the two repeat elements that flank
the type-specific genes and two repeat elements that are
closely linked in the recipient chromosome. Symbols
represent the same as in FIG. 15A.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

The present inventors have determined a genetic and
physical map that encompasses the region responsible for
the synthesis of the polysaccharide capsule of S.
pneumoniae. The polysaccharide capsule of S. pneumoniae
is a potent defense against the immune response of the
host organism and is directly involved in bacterial
virulence. The capsule locus, cps, is basically composed
of two functional regions: a central region that contains
the genes responsible for capsular biosynthesis and is
described herein as the type-specific region, and the
non-type specific regions that flank the central
biosynthetic type-specific region.

S. pneumoniae has evolved a complex 'antigenic
shift' mechanism that allows the bacteria to evade the
host immune system. The antigenic shift of S. pneumoniae
occurs via homologous recombination, in which a type-specific
cassette is replaced through natural transformation.
S. pneumoniae is naturally competent, allowing for the
acquisition of chromosomal DNA from exogenous sources,
such as other S. pneumoniae. Disclosed herein is
evidence identifying the non-type specific regions as
being responsible for providing the sequence identity
that allows for homologous recombination cross-over
points.

The present inventors have identified and cloned the
region of the S. pneumoniae chromosome that contains
genes involved in the production of type 3 capsular
polysaccharide, and that is specific to type 3 strains.
They have also cloned approximately 1-3 kb of DNA
flanking both sides of this region and found it to be
common to all capsular serotypes examined. A genetic and
physical map of the region is presented in FIG. 6A, FIG.
6B, FIG. 6C, FIG. 6D, FIG. 6E, FIG. 6F, FIG. 6G, FIG. 6H,
FIG. 6I, and FIG. 6J. A simplified version is
shown in FIG. 7 and FIG. 8. The sites of insertion
mutations made within the region are shown in FIG. 7.
The regions found by hybridization studies to be specific
to type 3 or common to all capsule types are also
indicated in FIG. 7. The cloning of the upstream region,
creation of insertion mutations, sequence analysis of the
region, hybridization analyses using the upstream region,
and an in vitro assay of type 3 capsule polymerization
are described in the following examples.

The DNA sequence of the region containing the nine
genes, cps3B, cps3C, cps3E, cps3D, cps3S, cps3U, cps3M,
'plpA and tnpA, and the flanking DNA was determined and is
presented in FIG. 6A, FIG. 6B, FIG. 6C, FIG. 6D,
FIG. 6E, FIG. 6F, FIG. 6G, FIG. 6H, FIG. 6I, and FIG. 6J
(SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:5 and
SEQ ID NO:6) along with the deduced amino acid sequences
(SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ
ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ
ID NO:15 and SEQ ID NO:16).

Based on genetic, molecular, and biochemical data,
the inventors have been able to assign putative functions
to the type-specific genes in the pathway for type 3
capsular polysaccharide biosynthesis. Two of the genes,
cps3D and cps3S, are required for capsule synthesis.
There is substantial evidence to indicate that cps3D
encodes UDP-glucose dehydrogenase. Described herein is
genetic evidence to indicate that several mutations
causing the capsule-negative phenotype are located in the
gene for UDP-glucose dehydrogenase. The predicted amino
acid sequence has characteristics consistent with this
function. Cps3D shows a high degree of homology to HasB,
which is the UDP-glucose dehydrogenase of S. pyogenes
(Dougherty & van de Rijn, 1993). Within Cps3D are
sequences homologous to the active site and the NAD-
binding site in known UDP-glucose dehydrogenases. It is
not possible to perform the standard UDP-glucose
dehydrogenase assay on extracts of S. pneumoniae due to
the presence of an NADH oxidase, which copurifies with the
enzyme (Smith, et al., 1960; Smith, et al., 1958).
However, extracts from cps3D mutants could synthesize
type 3 capsule in vitro if supplied with UDP-glucuronic
acid, i.e., they lacked the ability to convert UDP-
glucose to UDP-glucuronic acid and thus lack UDP-glucose
dehydrogenase activity.



Cps3S is a new member of a family of polysaccharide
synthases. All of these polysaccharide synthases for
which the structures of the polysaccharides are known
produce β(1-4) linked polysaccharides. Thus, it is
possible that Cps3S forms the β(1-4) linkage in the
disaccharide cellobiuronic acid (glcA β(1-4) glc), and
that a second enzyme is required to polymerize (i.e.,
create the β(1-3) linkages) the disaccharides into the
full length polysaccharide. However, HasA, the enzyme
most closely related to Cps3S, creates both linkages, a
β(1-4) and a β(1-3), in the production of hyaluronic acid
capsule (DeAngelis, et al., 1993). HasA has recently
been shown to be sufficient for hyaluronic acid synthesis
in heterologous bacteria, given the nucleotide sugar
substituents (DeAngelis, et al., 1993a). Because the
inventors did not find another required enzyme in the
type 3-specific region, Cps3S, like HasA, may synthesize
the polysaccharide by monomer addition.

Neither cps3U nor cps3M appears to be required for
type 3 capsule synthesis. Cps3M and Cps3U should
function to convert glucose-6-phosphate into glucose-1-
phosphate, and glucose-1-phosphate into UDP-glucose,
respectively (FIG. 10). Since UDP-glucose is necessary
for the production of essential cell constituents,
including teichoic acid and lipoteichoic acid (Austrian,
et al., 1959), the products of other genes may complement
the functions lost in Cps3U and Cps3M mutants. There are
at least two plausible reasons for the retention of these
genes in the type-specific region. One explanation is
that their functions cannot be fully duplicated by the
second enzymes. For example, they may play a role in
regulating the amount of polysaccharide produced. Under
given conditions, such as during infection, increased
production of capsule could be advantageous. The large
noncoding region upstream of cps3U might be a site at
which regulation of cps3U and cps3M occurs. It should
also be noted that the reactions carried out by Cps3M and
Cps3U are each reversible, and the enzymes might be more
active in the reverse reaction. Therefore, Cps3U and
Cps3M might function to limit the amount of capsule
produced.
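
The pathway logic described above can be sketched as a toy model. The
following Python fragment is only an illustration under stated
assumptions: the reaction list is a simplification of the steps named
in the text (FIG. 10), substrates are treated as present or absent
rather than consumed, and the names REACTIONS and reachable are
hypothetical. It reproduces the reasoning that an extract lacking
Cps3D can form type 3 polysaccharide only when UDP-glucuronic acid is
supplied.

```python
# Toy reachability model of the type 3 capsule pathway (assumed simplification).
# Each entry: (enzyme, set of required substrates, product).
REACTIONS = [
    ("Cps3M", {"glucose-6-phosphate"}, "glucose-1-phosphate"),
    ("Cps3U", {"glucose-1-phosphate"}, "UDP-glucose"),
    ("Cps3D", {"UDP-glucose"}, "UDP-glucuronic acid"),
    ("Cps3S", {"UDP-glucose", "UDP-glucuronic acid"}, "type 3 polysaccharide"),
]


def reachable(enzymes, supplied):
    """Return every metabolite obtainable from `supplied` using `enzymes`."""
    pool = set(supplied)
    changed = True
    while changed:
        changed = False
        for enzyme, substrates, product in REACTIONS:
            if enzyme in enzymes and substrates <= pool and product not in pool:
                pool.add(product)
                changed = True
    return pool


# Extract from a cps3D mutant: every modeled enzyme except Cps3D is active.
cps3d_mutant = {"Cps3M", "Cps3U", "Cps3S"}
without_glca = reachable(cps3d_mutant, {"UDP-glucose"})
with_glca = reachable(cps3d_mutant, {"UDP-glucose", "UDP-glucuronic acid"})
print("type 3 polysaccharide" in without_glca)
print("type 3 polysaccharide" in with_glca)
```

The model deliberately ignores the complementing enzymes encoded
elsewhere in the chromosome, so it should not be read as predicting the
phenotype of cps3U or cps3M mutants.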

Another possible explanation is that these genes
were obtained along with the necessary type-specific
genes in a horizontal transfer from another organism and
have not been lost. This theory is consistent with the
hybridization data indicating that none of the type-
specific genes could be detected even at low stringency
in strains of six other pneumococcal types, including
types with related capsular polysaccharide structures
(Dillard, et al., 1994). However, if these genes serve
no necessary function, it is surprising that they have
been maintained in the type 3 cassettes of multiple
strains; i.e., the restriction maps of the type 3 regions
of five non-clonal type 3 strains are identical, and all
have cps3U and cps3M.

There are three requirements for a DNA region
to be considered a gene cassette: 1) more than one copy
of a gene or set of genes must exist, each specifying the
production of a different, but related, product; 2) each
copy must be flanked by DNA which is common to all the
copies; and 3) cassettes must undergo recombination
resulting in the replacement of one copy by another.
More than 80 different capsular serotypes of S.
pneumoniae have been identified, and the structures of
more than half of the polysaccharides have been
determined (van Dam, et al., 1990).

The presence of multiple types implies that as many
different sets of genes exist. The inventors have shown
that all the necessary genes specific for the production
of capsules of types 2, 3, and 6B (Example 16) are
closely linked to an approximately 1.2 kb fragment present
in all capsule types examined. This fragment
(corresponding to SEQ ID NO:6 and part of SEQ ID NO:5,
see FIG. 4 and Example 5), cloned from the region
flanking the type 3-specific genes, contains a gene with
a sequence virtually identical to a gene fragment from a
type 2 strain, described by Pearce et al. and designated
plpA (Pearce, et al., 1993). However, the flanking
region from the type 3 strain is distinct from the sequence
described by Pearce et al. (1993) in that it is missing
about one third of the 5' end of the gene, and is therefore
designated 'plpA. Furthermore, Pearce et al. did not
identify the location of the plpA gene, nor did they attempt
to define the sequences on either side of the gene.

The mapping studies reported here confirm that the
regions to the right of this fragment are common for at
least 4 kb. The regions map differently to the left of
the fragment (Example 16), implying that these regions
contain the type-specific genes in types 2 and 6B, as
shown herein for type 3.

The upstream left flanking region from type 3 (SEQ
ID NO:1, SEQ ID NO:2, SEQ ID NO:3 and SEQ ID NO:4) is
common to all capsule types examined (2, 3 and 6B; the
Repeat region was also examined in 5, 6A, 8, 9 and 22;
Example 17). However, the presence of multiple copies of
the Repeat fragment (SEQ ID NO:4) has made the linkage in
other types difficult to determine, and it is possible
that the repeat region may not flank the type-specific
genes in other types.

Previous workers have provided biochemical evidence
of replacement of capsule gene cassettes. When a type 1
and a type 2 strain were each transformed to type 14 or
type 23, they no longer produced UDP-glucuronic acid,
implying that the transformants had lost the UDP-glucose
dehydrogenase gene (Austrian, et al., 1959). Similarly,
a type 1 strain transformed to type 3 encapsulation no
longer epimerized UDP-glucuronic acid to UDP-galacturonic
acid. Molecular evidence is presented herein for the
replacement of the type-specific genes for a type 3
strain transformed to type 2 encapsulation and a type 2
strain transformed to type 3 (Example 6). Together with
Example 18, these observations provide strong evidence
for a cassette organization of the type-specific genes.

Since the proposal was put forth that capsule genes
are exchanged through a cassette-type recombination,
there has always been one glaring exception - binary
encapsulation. At low frequency, strains of certain
capsule types transformed with DNA from strains of
certain other capsule types were found to produce both
polysaccharides (Austrian, et al., 1959). Evidently,
cassette-type recombination had not occurred in these
transformants since the genes for the original capsule
had been maintained. Bernheimer et al. found that
stable binary strains contained the second set of type-
specific genes at a site unlinked to the recipient's
type-specific genes (Bernheimer, et al., 1967). Unstable
binary strains frequently lost the donor type-specific
genes, which were usually located at a site linked to the
recipient type-specific genes (Bernheimer, et al., 1967;
Bernheimer and Wermundsen, 1969).

Based on the hybridization data described here
concerning the flanking regions and replacement of type-
specific genes, as well as the work of Bernheimer
concerning transformation to binary capsule types
(Bernheimer, et al., 1968; Bernheimer, et al., 1967;
Bernheimer, et al., 1969; Bernheimer and Wermundsen,
1972), the inventors can now propose models for capsule
type change and binary capsule type formation. Cassette-
type recombination would result from crossover events in
the homologous flanking regions, leading to replacement
of the type-specific genes. The left crossover could
take place in the repeated element in strains containing
this region linked to the type-specific genes but would
occur in flanking DNA further upstream in strains that
did not contain the repeat. This type of recombination
is shown in FIG. 15A.
The finding of a repeated element upstream of the
type 3 capsule genes (SEQ ID NO:4; Example 17) may
provide an explanation for binary encapsulation. It is
clear that at least one of the copies of the repeated
fragment in types 2 and 6B is unlinked to the capsule
genes, since neither of the type-specific cassettes could
be moved with a marker inserted in this location. In
type 3, a 2.2 kb HindIII fragment containing the repeat
element is linked to the type-specific genes but, based on
transformation studies, an 8 kb fragment is not. The
frequency of binary capsule transformants observed by
Bernheimer et al. was significantly lower (10-1000 fold)
than related transformations resulting in replacement,
leading them to suggest that the recombination event
involved strong homology at only one end. Once
integrated at the "atypical" location (unlinked to the
type-specific cassette), the genes for the second capsule
type could not be moved to the normal location, except by
transformation of a strain containing the genes of that
type in the normal location, again suggesting that the
non-type specific flanking DNA on at least one end had
been lost (Bernheimer, et al., 1967; Bernheimer and
Wermundsen, 1972).

Because part of the left flanking DNA
(SEQ ID NO:4) is repeated in the chromosomes of several
strains, whereas the right flanking region is present
in only one location, it is proposed that the repeated
element of the left flanking region may be involved in
the recombination that results in binary capsule type
formation. The mechanism proposed by Bernheimer et al.
for stable binary strains could involve homologous
recombination at a repeated element unlinked to the
capsule locus; the recombination at the other end of the
capsule genes would occur by an apparent illegitimate
recombination event, as shown in FIG. 15B.
An alternative possibility for the generation of
stable binary strains, shown in FIG. 15C, involves a
transposition-like event that could result if certain
type-specific genes are flanked on both sides by the
repeated element. Unstable binary strains could result
from either type of integration occurring at repeated
elements in, or closely linked to, the recipient's type-
specific genes. Instability could result from
recombination through genes common to both capsule types,
as suggested by Bernheimer et al. for the UDP-glucose
dehydrogenases of types 1 and 3. Results presented here
provide the basis for examining these possibilities.
Binary strains containing the two sets of genes linked
are of particular interest since they might recombine to
form a novel capsule type. Examination of strains
producing related capsule structures may help elucidate
the possible mechanisms involved in novel capsule type
formation.

Epidemiological studies have indicated that capsule 
type varies independently of other factors, suggesting 
that a substantial amount of genetic exchange has 
occurred (Crain et al., 1990; Coffey et al., 1991;
Versalovic et al., 1993). Nonetheless, virulence of 
clinical isolates appears to correlate with the capsule 
type expressed (Briles et al., 1992). Taken together, 
these data suggest that the capsule type has a prominent 
role in determining virulence. However, epidemiological 
studies cannot demonstrate a causal relationship between 
capsule type and virulence due to the variability in the 
genetic backgrounds of the different serotypes. The 
characterization of the S. pneumoniae capsule locus 
described here has facilitated the construction of 
isogenic strains differing only in capsule type. These 
strains have been used to evaluate the role of capsule
type in virulence (Example 18) . The cloning of capsule 
genes and elucidation of the genetic organization of the 
capsule locus is a significant step toward understanding
antigenic variation and virulence in this pathogen. 

Cloning of the Type 3 region 

There are a number of reasons for first cloning the
type 3-specific genes: the type 3 capsule has a
relatively simple structure that is expected to require a
small number of genes for its synthesis; production of
type 3 capsule is an easily identifiable phenotype; and
finally, the availability of antibodies specific for the
type 3 polysaccharide capsule allowed rapid screening of
a large number of isolates for the presence of the
capsule, using an ELISA assay as described herein in
Example 1. The approach provided and disclosed allowed
for the first molecular genetic map of the cps gene
locus.

The cloning of additional type-specific genes has
been accomplished using the information derived from the
present invention. Taking advantage of the non-type
specific region, one can isolate the DNA encoding other
type-specific genes by simply obtaining a strain of S.
pneumoniae known to have a type-specific capsule.
Polymerase chain reaction, using primers that are each
specific for one of the non-type specific flanking regions
and directed toward the opposite flanking region, is used to
amplify the type-specific gene cassette. Where the size
of the type-specific region is unknown, restriction
fragment length polymorphism analysis, using probes
specific for either or both of the non-type specific
regions, may be used to determine the size.
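
The primer arrangement described above can be sketched in silico. The
following Python fragment is a minimal illustration, assuming exact
primer matches on a single template strand; the function names
(reverse_complement, amplify) and the toy sequences are hypothetical
and are not taken from the sequences disclosed herein.

```python
# Minimal in-silico PCR sketch: one primer in each conserved flank,
# each pointing toward the other, so the product spans the cassette.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def amplify(template, fwd_primer, rev_primer):
    """Return the amplicon bounded by the two primers, or None.

    fwd_primer matches the top strand of the upstream flank;
    rev_primer matches the bottom strand of the downstream flank.
    """
    template = template.upper()
    start = template.find(fwd_primer.upper())
    if start < 0:
        return None
    rev_site = reverse_complement(rev_primer)
    end = template.find(rev_site, start + len(fwd_primer))
    if end < 0:
        return None
    return template[start:end + len(rev_site)]


# Toy locus: common flanks surrounding an unknown type-specific cassette.
upstream = "ATGGCTAGCTAGGATCCA"
cassette = "TTTTCCCCGGGGAAAA"
downstream = "CGATCGTACGTTAGCTAA"
locus = upstream + cassette + downstream

fwd = upstream[5:15]                          # primer in the upstream flank
rev = reverse_complement(downstream[3:13])    # primer in the downstream flank
product = amplify(locus, fwd, rev)
print(len(product), cassette in product)
```

Because both primers anneal only to the common flanks, the same primer
pair would be expected to work regardless of which cassette lies
between them, which is the point of the approach; real primer design
would also have to account for mismatches and amplicon length.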

Antibodies specific for type-specific antigenic
epitopes may be used with the present invention to
distinguish the S. pneumoniae strain and to evaluate its
stability prior to, and after, cloning of the
region. They may also be used for verifying the directed
transfer of type-specific genes to a prospective host.
Preferred hosts for polysaccharide capsule production
will be gram positive bacteria, in particular members of
the Streptococcus, Bacillus and even Staphylococcus
species.

Selection of Hosts for Type-Specific Capsule Production 

The present invention provides methods for the
selection, isolation and transformation of Streptococcus
sp. with a type-specific capsule polysaccharide gene
locus. A cps locus can now be isolated and used to
specifically change the capsular phenotype of a selected
host organism. The preferred host organism for use with
the present invention is a bacterium that produces high
amounts of the capsular polysaccharide.

Once a suitable high producing host is identified,
it will be used to carry the type-specific genes of
choice, as shown in Examples 6 and 18. The organisms can
be converted to other serotypes by transforming the high
producing recipient bacteria with a gene cassette or with
intact genomic DNA. A gene cassette, as previously
mentioned, is a segment of DNA comprising one or more
genes flanked by specific DNA sequences that enable
incorporation of the cassette into a recipient cell's
chromosome at a specific site or locus via homologous
recombination. A cassette may contain type-specific
genes, either alone or in combination with non-type
specific genes. Of course, the preferred construct for
transfection will be a cassette containing the non-type
specific flanking regions.

A cassette of the cps locus comprising the cps
genes and the 5' and 3' flanking regions donated from any
one of the 85 S. pneumoniae serotypes may be transformed
into a recipient S. pneumoniae also belonging to any one
of the 85 serotypes. During transformation, recombination
would occur in the flanking regions, resulting in the
replacement of the recipient's type-specific region by
that of the donor. The capsule type of the recipient
would be expected to change to that of the donor.
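
The replacement just described can be expressed as a small data model.
The sketch below is illustrative only: CapsuleLocus and
cassette_replacement are hypothetical names, flank homology is reduced
to simple string equality, and recombination with homology at only one
end (binary encapsulation) is not modeled.

```python
# Toy model of cassette-type recombination at the cps locus.
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CapsuleLocus:
    left_flank: str    # common (non-type specific) upstream DNA
    cassette: str      # type-specific genes
    right_flank: str   # common downstream DNA (e.g. containing plpA)
    serotype: str


def cassette_replacement(recipient, donor):
    """Swap in the donor cassette when both flanks are shared.

    If either flank differs, the recipient is returned unchanged,
    since a double crossover in common DNA is assumed to be required.
    """
    shares_left = recipient.left_flank == donor.left_flank
    shares_right = recipient.right_flank == donor.right_flank
    if shares_left and shares_right:
        return replace(recipient, cassette=donor.cassette,
                       serotype=donor.serotype)
    return recipient


type2 = CapsuleLocus("common-upstream", "type2-genes", "common-downstream", "2")
type3 = CapsuleLocus("common-upstream", "type3-genes", "common-downstream", "3")
transformant = cassette_replacement(type2, type3)
print(transformant.serotype, transformant.cassette)
```

The recipient keeps its own flanks and acquires only the donor's
type-specific region, mirroring the expectation that the capsule type
of the recipient changes to that of the donor.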

The introduction of a gene cassette comprising
the cps locus, DNA segment or genetic element into S.
pneumoniae may be performed by a variety of methods. A
particularly preferred embodiment would be to digest the
donor S. pneumoniae genomic DNA with one or more
restriction enzymes such as those described in Table 1,
for example, and then separate the entire cps locus from
the rest of the genomic DNA by gel purification. This
specific DNA segment, cps locus or genetic element, may
then be ligated into a vector such as a plasmid or cosmid
or bacteriophage, and transformed by various methods into
the recipient S. pneumoniae. Alternatively, the donor S.
pneumoniae's entire genomic DNA may be naturally
transformed into the recipient by a suitable method,
e.g., as described in Example 1. Further still, the
donor's genomic DNA may be digested with one or more
restriction enzymes and then ligated into a plasmid,
cosmid or bacteriophage, without selecting specifically
for the cps locus. This may then be transformed into the
recipient S. pneumoniae.

TABLE 1

RESTRICTION ENZYMES

Aat II          GACGT/C
Acc I           GT/MKAC
Acc II          CG/CG
Acc III         T/CCGGA
Aci I           CCGC(2/2)
Acy I           GR/CGYC
Afl II          C/TTAAG
Afl III         A/CRYGT
Age I           A/CCGGT
Aha III         TTT/AAA
Alu I           AG/CT
AlwN I          CAGNNN/CTG
Aoc I           CC/TNAGG
Apa I           GGGCC/C
ApaB I          GCANNNNN/TGC
ApaL I          G/TGCAC
Asc I           GG/CGCGCC
Asu I           G/GNCC
Asu II          TT/CGAA
Ava I           C/YCGRG
Ava II          G/GWCC
Ava III         ATGCAT
Avr II          C/CTAGG
Bae I           ACNNNNGTAYC
Bal I           TGG/CCA
BamH I          G/GATCC
Bbv I           GCAGC(8/12)
Bbv II          GAAGAC(2/6)
Bcc I           CCATC
Bcef I          ACGGC(12/13)
Bcg I           GCANNNNNNTCG(12/10)
Bcl I           T/GATCA
Bet I           W/CCGGW
Bgl I           GCCNNNN/NGGC
Bgl II          A/GATCT
Bin I           GGATC(4/5)
Bpu10 I         CCTNAGC(-5/2)
Bpu1102 I       GC/TNAGC
Bsp1286 I       GDGCH/C
Bsp106 I        AT/CGAT
BspC I          CGAT/CG
BsaA I          YAC/GTR
BsaB I          GATNN/NNATC
BseP I          GCGCGC
Bsg I           GTGCAG(16/14)
Bsi I           CTCGTG(5/1)
BsiY I          CCNNNNN/NNGG
Bsm I           GAATGC(1/-1)
BsmA I          GTCTC(1/5)
Bsp50 I         CG/CG
BspG I          CG/CGCTGGAC
BspH I          T/CATGA
BspM I          ACCTGC(4/8)
BspM II         T/CCGGA
Bsr I           ACTGG(1/-1)
BsrB I          GAGCGG(-3/-3)
BstE II         G/GTNACC
BstN I          CC/WGG
BstX I          CCANNNNN/NTGG
Cac8 I          GCN/NGC
Cau II          CC/SGG
Cfr I           Y/GGCCR
Cfr10 I         R/CCGGY
Cla I           AT/CGAT
CviJ I          RG/CY
CviR I          TG/CA
Dde I           C/TNAG
Dpn I           GA/TC
Dra I           TTT/AAA
Dra II          TG/GNCCY
Dra III         CACNNN/GTG
Drd I           GACNNNN/NNGTC
Drd II          GAACCA
Dsa I           C/CRYGG
Eam1105 I       GACNNN/NNGTC
Eci I           TCCGCC
Eco31 I         GGTCTC(1/5)
Eco47 III       AGC/GCT
Eco52 I         C/GGCCG
Eco57 I         CTGAAG(16/14)
EcoN I          CCTNN/NNNAGG
EcoR I          G/AATTC
EcoR II         /CCWGG
EcoR V          GAT/ATC
Esp I           GC/TNAGC
Esp3 I          CGTCTC(1/5)
Fau I           CCCGC(4/6)
Fin I           GTCCC
Fnu4H I         GC/NGC
FnuD II         CG/CG
Fok I           GGATG(9/13)
Fse I           GGCCGG/CC
Fsi I           R/AATTY
Gdi II          YGGCCG(-5/-1)
Gsu I           CTGGAG(16/14)
Hae I           WGG/CCW
Hae II          RGCGC/Y
Hae III         GG/CC
Hga I           GACGC(5/10)
HgiA I          GWGCW/C
HgiC I          G/GYRCC
HgiE II         ACCNNNNNNGGT
HgiJ II         GRGCY/C
Hha I           GCG/C
Hind II         GTY/RAC
Hind III        A/AGCTT
Hinf I          G/ANTC
Hin1 I          GR/CGYC
Hpa I           GTT/AAC
Hpa II          C/CGG
Hph I           GGTCA(8/7)
Kpn I           GGTAC/C
Ksp632 I        CTCTTC(1/4)
Ksp I           CCGC/GG
Mae I           C/TAG
Mae II          A/CGT
Mae III         /GTNAC
Mbo I           /GATC
Mbo II          GAAGA(8/7)
Mcr I           CGRY/CG
Mfe I           C/AATTG
Mlu I           A/CGCGT
Mly I           GACTC(5/5)
Mme I           TCCRAC(20/18)
Mnl I           CCTC(7/7)
Mse I           T/TAA
Msp I           C/CGG
Mst I           TGC/GCA
Mst II          CC/TNAGG
Mwo I           GCNNNNN/NNGC
Nae I           GCC/GGC
Nar I           GG/CGCC
Nci I           CC/SGG
Nco I           C/CATGG
Nde I           CA/TATG
Nhe I           G/CTAGC
Nla III         CATG/
Nla IV          GGN/NCC
Not I           GC/GGCCGC
Nru I           TCG/CGA
Nsi I           ATGCA/T
Nsp I           RCATG/Y
NspB II         CMG/CKG
Pac I           TTAAT/TAA
Pal I           GG/CC
Pfl1108 I       TCGTAG
PflM I          CCANNNN/NTGG
Ple I           GAGTC(4/5)
PmaC I          CAC/GTG
Pme I           GTTT/AAAC
PpuM I          RG/GWCCY
PshA I          GACNN/NNGTC
PspA I          C/CCGGG
Pst I           CTGCA/G
Pvu I           CGAT/CG
Pvu II          CAG/CTG
RleA I          CCCACA(12/9)
Rsa I           GT/AC
Rsr II          CG/GWCCG
Sac I           GAGCT/C
Sac II          CCGC/GG
Sal I           G/TCGAC
Sap I           GCTCTTC(1/4)
Sau3A I         /GATC
Sau96 I         G/GNCC
Sau I           CC/TNAGG
Sca I           AGT/ACT
ScrF I          CC/NGG
Sdu I           GDGCH/C
Sec I           C/CNNGG
SfaN I          GATC/(5/9)
Sfc I           CTYRAG
Sfe I           C/TYRAG
Sfi I           GGCCNNNN/NGGC
SgrA I          CR/CCGGYG
Sma I           CCC/GGG
Sna I           CTATAC
SnaB I          TAC/GTA
Spe I           A/CTAGT
Sph I           GCATG/C
Spl I           C/GTACG
Srf I           GCCC/GGGC
Sse8387 I       CCTGCA/GG
Ssp I           AAT/ATT
Stu I           AGG/CCT
Sty I           C/CWWGG
Swa I           ATTT/AAAT
Taq I           T/CGA
Taq II          GACCGA(11/9)
Tfi I           GAWTC
Tsp45 I         GTSAC
TspE I          AATT
Tth111 I        GACN/NNGTC
Tth111 II       CAARCA(11/9)
Vsp I           AT/TAAT
Xba I           T/CTGAGA
Xcm I           CCANNNNN/NNNNTGG
Xho I           C/TCGAG
Xho II          R/GATCY
Xma I           C/CCGGG
Xma III         C/GGCCG
Xmn I           GAANN/NNTTC

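
The recognition sequences in Table 1 use the standard single-letter
ambiguity codes (R, Y, M, K, S, W, D, H, N, and so on), with "/"
marking the cut position in the strand shown. As a hedged sketch of
how such entries might be applied to a sequence, the Python fragment
below converts a slash-notation site into a regular expression and
lists cut positions. It is an assumption-laden illustration: the names
IUPAC, find_cut_sites and digest are hypothetical, only the strand
shown is searched (sufficient for palindromic sites), and entries given
in the offset notation, e.g. GGATG(9/13), are not handled.

```python
import re

# IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "M": "[AC]", "K": "[GT]",
    "S": "[CG]", "W": "[AT]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_cut_sites(dna, site):
    """Return 0-based indices of the base immediately after each cut.

    `site` is a Table 1 style entry with a slash, e.g. "G/AATTC".
    """
    site = site.upper()
    cut_offset = site.index("/")
    pattern = "".join(IUPAC[base] for base in site.replace("/", ""))
    # A lookahead is used so that overlapping occurrences are all reported.
    return [m.start() + cut_offset
            for m in re.finditer("(?=(" + pattern + "))", dna.upper())]


def digest(dna, site):
    """Return the fragments produced by cutting `dna` at every site."""
    edges = [0] + find_cut_sites(dna, site) + [len(dna)]
    return [dna[a:b] for a, b in zip(edges, edges[1:]) if a < b]


sample = "TTGAATTCAAGGATCCTT"
print(find_cut_sites(sample, "G/AATTC"))   # EcoR I entry from Table 1
print(digest(sample, "G/GATCC"))           # BamH I entry from Table 1
```

Fragment lists of this kind correspond to what would be resolved by gel
purification when the cps locus is separated from the rest of the
genomic DNA.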
Nucleic Acid Hybridization

The DNA sequences disclosed herein will find utility
as probes or primers in nucleic acid hybridization
embodiments. As such, it is contemplated that
oligonucleotide fragments corresponding to the
sequence(s) of SEQ ID NO:1, SEQ ID NO:2 and SEQ ID NO:3
(including sequences in between), SEQ ID NO:4, SEQ ID
NO:5 and SEQ ID NO:6 for stretches of between about 10-14
nucleotides to about 20 or to about 30 nucleotides will
find particular utility, with even longer sequences,
e.g., 40, 50, 100, 200, 500, and even up to full length,
being more preferred for certain embodiments. The
ability of such nucleic acid probes to specifically
hybridize to non-type-specific and to type-specific-
encoding sequences will enable them to be of use in a
variety of embodiments. For example, the probes can be
used in a variety of assays for detecting the presence of
complementary sequences in a given sample, as may be
used, for example, to isolate related type-specific genes.
Alternatively, one may use the non-type-specific regions
to aid in the isolation and cloning of additional type-
specific cassettes. However, other uses are envisioned,
including the use of the sequence information for the
preparation of mutant species primers, or primers for use
in preparing other genetic constructions.

Nucleic acid molecules having stretches of about 10-
14, 20, 30, 50, or even of about 100-200 nucleotides or
so, complementary to SEQ ID NO:1, SEQ ID NO:2 and SEQ ID
NO:3 (including sequences in between), SEQ ID NO:4, SEQ
ID NO:5 and SEQ ID NO:6, will have utility as
hybridization probes. These probes will be useful in a
variety of hybridization embodiments, such as Southern
and Northern blotting in connection with analyzing
genomic structure and organization of type-specific genes
or both linked and non-linked regulatory genes in diverse
strains of S. pneumoniae. The total size of fragment, as
well as the size of the complementary stretch(es), will
ultimately depend on the intended use or application of
the particular nucleic acid segment. Smaller fragments
will generally find use in hybridization embodiments,
wherein the length of the complementary region may be
varied, such as between about 10-14 and about 100
nucleotides, or even up to full length according to the
complementary sequences one wishes to detect.

The use of a hybridization probe of about 10-14
nucleotides in length allows the formation of a duplex
molecule that is both stable and selective. Molecules
having complementary sequences over stretches greater
than 10-14 bases in length are generally preferred,
though, in order to increase stability and selectivity of
the hybrid, and thereby improve the quality and degree of
specific hybrid molecules obtained. One will generally
prefer to design nucleic acid molecules having gene-
complementary stretches of 15 to 20 nucleotides, or even
longer where desired. Such fragments may be readily
prepared by, for example, directly synthesizing the
fragment by chemical means, by application of nucleic
acid reproduction technology, such as the PCR technology
of U.S. Patent 4,603,102 (incorporated herein by
reference) or by introducing selected sequences into
recombinant vectors for recombinant production.

Accordingly, the nucleotide sequences of the
invention may be used for their ability to selectively
form duplex molecules with complementary stretches of
non-type- and of type-specific genes. Depending on the
application envisioned, one will desire to employ varying
conditions of hybridization to achieve varying degrees of
selectivity of probe towards target sequence. Such
hybridization conditions are standard in the art and
include low stringency and high stringency. For
applications requiring high selectivity, one will
typically desire to employ relatively stringent
conditions to form the hybrids, e.g., one will select
relatively low salt and/or high temperature conditions,
such as provided by 0.02M-0.15M NaCl at temperatures of
50°C to 70°C. One particular example is using the
QuickHyb® system (Stratagene's Illuminator™
Nonradioactive Detection System) at 68°C. Such selective
conditions tolerate little, if any, mismatch between the
probe and the template or target strand, and would be
particularly suitable for isolating other genes encoding
gene products that are involved in the production of
capsule polysaccharides. A preferred embodiment for
hybridization conditions is described in detail in
Example 4. Further standard hybridization conditions can
be found in Sambrook et al. (1989), and are known to
those of skill in the art.

Of course, for some applications, for example, where
one desires to prepare mutants employing a mutant primer
strand hybridized to an underlying template or where one
seeks to isolate type-specific-encoding sequences from
related species, functional equivalents, or the like,
less stringent hybridization conditions will typically be
needed in order to allow formation of the heteroduplex.

One may also desire to employ other hybridization
techniques and to change salt conditions, such as varying
the amount of salt from between about 0.15M-0.9M. Other
parameters that can be modified include temperature, such
as temperatures ranging from 20°C to 55°C, to optimize the
signal-to-noise ratio and reduce unwanted background. The
techniques for optimizing hybridization conditions are
well known to those of skill in the art and are generally
also described within the instruction manuals for various
reagents and apparatus.
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In any case, it is generally appreciated that 
conditions can be rendered more stringent by the addition 
of increasing amounts of formamide, which serves to 
destabilize the hybrid duplex in the same manner as 
5 increased temperature. Thus, hybridization conditions 
can be readily manipulated as is known to those of skill 
in the art, and thus will generally be a method of choice 
depending on the desired results. 
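
The qualitative effect of salt, temperature and formamide on
stringency may be illustrated with a commonly used empirical
approximation for the melting temperature (Tm) of long DNA
duplexes. The formula, the function name and the example values
are illustrative assumptions and are not taken from the present
disclosure; the sketch is intended only to show the direction
in which each parameter moves the Tm.

```python
import math


def estimate_tm(na_molar, gc_percent, length_bp, formamide_percent=0.0):
    """Rough Tm (deg C) of a long DNA:DNA hybrid.

    Empirical rule of thumb (an assumption, not part of the disclosure):
    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%G+C) - 0.61*(%formamide) - 500/L
    """
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("salt concentration and length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


# Hypothetical 500 bp probe with 40% G+C content.
low_salt = estimate_tm(0.02, 40.0, 500)
high_salt = estimate_tm(0.9, 40.0, 500)
with_formamide = estimate_tm(0.9, 40.0, 500, formamide_percent=50.0)
print(round(low_salt, 1), round(high_salt, 1), round(with_formamide, 1))
```

Lower salt and added formamide both lower the estimated Tm, so
a fixed hybridization temperature becomes more stringent, in
line with the discussion above.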

In certain embodiments, it will be advantageous to
employ nucleic acid sequences of the present invention in
combination with an appropriate means, such as a label,
for determining hybridization. A wide variety of
appropriate indicator means are known in the art,
including fluorescent, radioactive, enzymatic or other
ligands, such as avidin/biotin, which are capable of
giving a detectable signal. In preferred embodiments,
one will likely desire to employ a fluorescent label or
an enzyme tag, such as urease, alkaline phosphatase or
peroxidase, instead of radioactive or other environmentally
undesirable reagents. In the case of enzyme tags,
colorimetric indicator substrates are known which can be
employed to provide a means, visible to the human eye or
spectrophotometrically, to identify specific
hybridization with complementary nucleic acid-containing
samples.

In general, it is envisioned that the hybridization
probes described herein will be useful both as reagents
in solution hybridization as well as in embodiments
employing a solid phase. In embodiments involving a
solid phase, the test DNA (or RNA) is adsorbed or
otherwise affixed to a selected matrix or surface. This
fixed, single-stranded nucleic acid is then subjected to
specific hybridization with selected probes under desired
conditions. The selected conditions will depend on the
particular circumstances based on the particular criteria
required (depending, for example, on the G+C contents,
type of target nucleic acid, source of nucleic acid, size
of hybridization probe, etc.). Following washing of the
hybridized surface so as to remove nonspecifically bound
probe molecules, specific hybridization is detected, or
even quantified, by means of the label.

Longer DNA segments will often find particular
utility in the recombinant production of peptides or
proteins. DNA segments which encode peptide antigens
from about 15 to about 50 amino acids in length, or more
preferably, from about 15 to about 30 amino acids in
length, are contemplated to be particularly useful, as are
DNA segments encoding entire cps locus encoded proteins,
such as those of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3,
SEQ ID NO:5 and SEQ ID NO:6. DNA segments encoding
peptides will generally have a minimum coding length on
the order of about 45 to about 150, or to about 90,
nucleotides.
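
The nucleotide figures quoted above follow from the
three-nucleotides-per-codon relationship. The short sketch
below, in which the function name is purely illustrative,
reproduces that arithmetic for the peptide lengths mentioned.

```python
def coding_length_nt(n_amino_acids):
    """Minimum coding length in nucleotides (three per codon, no stop codon)."""
    if n_amino_acids < 0:
        raise ValueError("number of amino acids must be non-negative")
    return 3 * n_amino_acids


for n in (15, 30, 50):
    print(n, "amino acids ->", coding_length_nt(n), "nucleotides")
```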


The nucleic acid segments of the present invention,
regardless of the length of the coding sequence itself,
may be combined with other DNA sequences, such as
promoters, repressors, attenuators, additional
restriction enzyme sites, multiple cloning sites, other
coding segments, and the like, such that their overall
length may vary considerably. It is contemplated that a
nucleic acid fragment of almost any length may be
employed, with the total length preferably being limited
by the ease of preparation and use in the intended
recombinant DNA protocol. For example, nucleic acid
fragments may be prepared in accordance with the present
invention which are up to about 10,000 base pairs in
length, with segments of about 5,000 or 3,000 being
preferred and segments of about 1,000 base pairs in
length being particularly preferred.

It will be understood that this invention is not
limited to the particular nucleic acid sequences of SEQ
ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID
NO:5 and SEQ ID NO:6, or to the particular amino acid
sequences of SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ
ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ
ID NO:14, SEQ ID NO:15 and SEQ ID NO:16. Therefore, DNA
segments prepared in accordance with the present
invention may also encode biologically functional
equivalent proteins or peptides which have variant amino
acid sequences. Such sequences may arise as a
consequence of codon redundancy and functional
equivalency which are known to occur naturally within
nucleic acid sequences and the proteins thus encoded.
Alternatively, functionally equivalent proteins or
peptides may be created via the application of
recombinant DNA technology, in which changes in the
protein structure may be engineered, based on
considerations of the properties of the amino acids being
exchanged.

DNA segments encoding a gene, including the cpsB,
cpsC, cpsE, cpsD, cpsS, cpsU, cpsM, plpA and tnpA genes,
may be introduced into recombinant host cells and
employed for expressing and producing the type-specific
proteins for use in producing type-specific capsule
polysaccharides. Alternatively, through the application
of genetic engineering techniques, subportions or
derivatives of selected type-specific gene locus genes
may be employed. Equally, through the application of
site-directed mutagenesis techniques, one may re-engineer
DNA segments of the present invention to alter the coding
sequence, e.g., to introduce improvements to the
antigenicity of the protein or to test mutants in order
to examine the production of capsule polysaccharides at
the molecular level. Where desired, one may also prepare
fusion peptides, e.g., where the type-specific coding
regions are aligned within the same expression unit with
other proteins or peptides having desired functions, such
as for immunodetection purposes (e.g., enzyme label
coding regions).


Screening Method Type-Specific Genes 

Screening for type-specific genes provides another
utility for the cps loci of the present invention. A
type-specific screening protocol will allow for the
epidemiological identification of S. pneumoniae and its
serotypes at the molecular level. By using one or both
of the non-type-specific regions as probes, one can
determine the presence of S. pneumoniae from a small
sample by immobilizing DNA from the sample onto a solid
matrix, for example a slot blot using nitrocellulose, and
hybridizing thereto a probe as described in the present
invention.

Using either or both of the non-type-specific
regions of the present invention as a probe or probes, one
may also screen Southern blots. The screening of
Southern blots may allow one to determine not only the
presence of S. pneumoniae but also the exact genotype of
S. pneumoniae present in the sample. In conjunction with
densitometric analysis of a Southern blot containing
multiple serotypes, one may determine not only the relative
frequency of serotypes within a sample, but in addition
one may examine the changing characteristics of the
serotypes by examining samples taken at distinct time
periods.
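
The densitometric estimate of relative serotype frequency
described above amounts to normalizing band intensities
within a lane. The following sketch assumes that one
background-corrected intensity is available per serotype;
the serotype labels and intensity values are hypothetical
and serve only to illustrate the calculation.

```python
def relative_frequencies(band_intensities):
    """Convert per-serotype band intensities into fractions of the lane total."""
    total = sum(band_intensities.values())
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return {serotype: value / total
            for serotype, value in band_intensities.items()}


# Hypothetical background-corrected intensities from two sampling times.
first_sample = relative_frequencies({"type 3": 30.0, "type 14": 10.0})
later_sample = relative_frequencies({"type 3": 10.0, "type 14": 30.0})
for serotype in first_sample:
    change = later_sample[serotype] - first_sample[serotype]
    print(serotype, "change in relative frequency:", round(change, 2))
```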

Such screening also allows the clinician to determine if a
patient is having a recurrence of a particular serotype,
if the patient is susceptible to a particular serotype or
types, or if a new serotype is increasing in the
population.

Site-Specific Mutagenesis

Site-specific mutagenesis, also known as site-directed
mutagenesis, is a technique useful in the
preparation of changes, directed by the laboratory
technician, that change the characteristics of genes and
their gene products, for the addition of restriction
sites, for modifying the activity of promoters,
repressors, attenuators, and for directed changes
affecting recombination. All of these changes may be
produced through specific mutagenesis of the underlying
non-type- and type-specific DNA of the present invention.
The technique further provides a ready ability to prepare
and test sequence variants, for example, incorporating
one or more of the foregoing considerations, by
introducing one or more nucleotide sequence changes into
the DNA. Site-specific mutagenesis allows the production
of mutants through the use of specific oligonucleotide
sequences which encode the DNA sequence of the desired
mutation, as well as a sufficient number of adjacent
nucleotides, to provide a primer sequence of sufficient
size and sequence complexity to form a stable duplex on
both sides of the deletion junction being traversed.
Typically, a primer of about 17 to 25 nucleotides in
length is preferred, with about 5 to 10 residues on both
sides of the junction of the sequence being altered.
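
The primer geometry described above, a change flanked by
matching residues on each side, can be sketched as follows.
The function name, the template sequence and the choice of a
single-base substitution are illustrative assumptions; the
sketch returns the primer in the same sense as the supplied
template, and in practice the strand complementary to the
single-stranded vector would be synthesized.

```python
def design_mutagenic_primer(template, position, new_base, flank=10):
    """Build a primer carrying one base change with `flank` matching bases per side.

    `position` is the 0-based index of the base to be replaced. With the
    default flank of 10 the primer is 21 nucleotides long.
    """
    if not 0 < flank <= position < len(template) - flank:
        raise ValueError("position must leave `flank` bases on both sides")
    upstream = template[position - flank:position]
    downstream = template[position + 1:position + 1 + flank]
    return upstream + new_base + downstream


# Hypothetical 40-nucleotide template; replace the A at index 20 with G.
template = "ACGT" * 10
primer = design_mutagenic_primer(template, 20, "G")
print(primer, len(primer))
```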

In general, the technique of site-specific
mutagenesis is well known in the art, as exemplified by
various publications. As will be appreciated, the
technique typically employs a phage vector which exists
in both a single-stranded and double-stranded form.
Typical vectors useful in site-directed mutagenesis
include vectors such as the M13 phage. These phage are
readily commercially available and their use is generally
well known to those skilled in the art. Double-stranded
plasmids are also routinely employed in site-directed
mutagenesis, which eliminates the step of transferring the
gene of interest from a plasmid to a phage.

In general, site-directed mutagenesis in accordance
herewith is performed by first obtaining a single-stranded
vector or melting apart the two strands of a
double-stranded vector which includes within its sequence
a DNA sequence which encodes the type-specific protein or
proteins encoded by the type-specific gene locus. An
oligonucleotide primer bearing the desired mutated
sequence is prepared, generally synthetically. This
primer is then annealed with the single-stranded vector,
and subjected to DNA polymerizing enzymes such as E. coli
polymerase I Klenow fragment, in order to complete the
synthesis of the mutation-bearing strand. Thus, a
heteroduplex is formed wherein one strand encodes the
original non-mutated sequence and the second strand bears
the desired mutation. This heteroduplex vector is then
used to transform appropriate cells, such as E. coli
cells, and clones are selected which include recombinant
vectors bearing the mutated sequence arrangement.



The preparation of sequence variants of type-specific
genes using site-directed mutagenesis is
provided as a means of producing potentially useful
species, for example a strain having enhanced production
of type-specific capsular polysaccharides, and is not
meant to be limiting as there are other ways in which
sequence variants of other type-specific genes may be
obtained. For example, recombinant vectors encoding
other type-specific genes, obtained as described herein
using the non-type-specific regions of the capsule
polysaccharide gene cassette, are encompassed.

Biological Functional Equivalents

Even though the invention has been described with a
certain degree of particularity, it is evident that many
alternatives, modifications, and variations will be
apparent to those skilled in the art in light of the
foregoing disclosure. Accordingly, it is intended that
all such alternatives, modifications, and variations
which fall within the spirit and the scope of the
invention be embraced by the defined claims.

As used in this application, the term "DNA segment"
refers to a DNA molecule that has been isolated free of
total genomic DNA of a particular species. Therefore, a
DNA segment encoding the cps locus refers to a DNA
segment that contains the 5' and/or 3' flanking regions,
or the cpsB, cpsC, cpsE, cpsD, cpsS, cpsU, cpsM, tnpA or
plpA coding sequences, yet is isolated away from, or
purified free from, total genomic DNA of S. pneumoniae.
Included within the term "DNA segment" are DNA segments
and smaller fragments of such segments, and also
recombinant vectors, including, for example, plasmids,
cosmids, phage, viruses, and the like.

Similarly, a DNA segment comprising an isolated or
purified 5' or 3' flanking region, or cpsB, cpsC, cpsE,
cpsD, cpsS, cpsU, cpsM, plpA or even tnpA gene, refers to
a DNA segment including the coding sequences and, in
certain aspects, regulatory sequences, isolated
substantially away from other naturally occurring genes
or protein encoding sequences. In this respect, the term
"gene" is used for simplicity to refer to a protein,
polypeptide or peptide encoding unit. As will be
understood by those in the art, this term includes
genomic sequences, cDNA sequences and smaller engineered
gene segments that express, or may be adapted to express,
proteins, polypeptides or peptides.

"Isolated substantially away from other coding
sequences" means that the locus of interest, in this case
the 5' or 3' flanking regions, or cpsB, cpsC, cpsE,
cpsD, cpsS, cpsU, cpsM, tnpA or plpA coding sequences,
forms the significant part of the DNA segment, and that
the DNA segment does not contain large portions of
naturally-occurring coding DNA, such as large chromosomal
fragments or other functional genes or cDNA coding
regions. Of course, this refers to the DNA segment as
originally isolated, and does not exclude genes or coding
regions later added to the segment by the laboratory
technician.

In particular embodiments, the invention concerns
isolated DNA segments and recombinant vectors
incorporating DNA sequences that include the 5' or 3'
flanking regions denoted by SEQ ID NO:1, SEQ ID NO:2, SEQ
ID NO:3 and SEQ ID NO:4 or SEQ ID NO:6, respectively.
DNA segments and vectors that incorporate DNA sequences
that encode CpsB, CpsC, CpsE, CpsD, CpsS, CpsU, CpsM,
PlpA or transposase A proteins that include within their
amino acid sequences an amino acid sequence as set forth
in SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10,
SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14,
SEQ ID NO:15 or SEQ ID NO:16, respectively, are also
included.

The term "a sequence as set forth in SEQ ID NO:7-16"
means that the sequence substantially corresponds to a
portion of SEQ ID NO:7-16 and has relatively few amino
acids that are not identical to, or a biologically
functional equivalent of, the amino acids of SEQ ID
NO:7-16. The term "biologically functional equivalent" is
well understood in the art and is further defined in
detail herein. Accordingly, sequences that have between
about 75% and about 85%; or more preferably, between
about 86% and about 95%; or even more preferably, between
about 96% and about 99%; of amino acids that are
identical or functionally equivalent to the amino acids
of SEQ ID NO:7-16 will be sequences that are "essentially
as set forth in SEQ ID NO:7-16."
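
As a rough illustration of the percentage criterion above,
the sketch below counts identical positions between two
sequences. It assumes the sequences are already aligned and
of equal length, and it counts only identical residues, not
functionally equivalent ones; the peptide strings are
hypothetical and are not drawn from the sequence listing.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions with identical residues in two pre-aligned sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


reference = "MKTAYIAKQR"   # hypothetical reference peptide
variant = "MKTAYLAKQK"     # differs at two of ten positions
print(percent_identity(reference, variant))
```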


Naturally, it will be understood that for the Cps
proteins, the definition of "equivalents" in this sense
does not extend to distinct, but homologous proteins,
such as CpsD and HasB from Streptococcus pyogenes; CpsS
and HasA from S. pyogenes, NodC from Rhizobium meliloti;
nor CpsU and GtaB from Bacillus subtilis. Rather, the
scope of equivalents contemplated are such that the
changes made still result in a protein that is
structurally and functionally a Cps protein.


In certain other embodiments, the invention concerns
isolated DNA segments and recombinant vectors that
include within their sequence a nucleic acid sequence
essentially as set forth in SEQ ID NO:1, SEQ ID NO:2, SEQ
ID NO:3, SEQ ID NO:4 and SEQ ID NO:6, for the flanking
regions; and SEQ ID NO:5 for the type-specific encoding
regions. The term "as set forth in SEQ ID NO:5" is used
in the same sense as described above and means that the
nucleic acid sequence substantially corresponds to a
portion of SEQ ID NO:5 and has relatively few codons that
are not identical, or functionally equivalent, to the
codons of SEQ ID NO:5. The term "functionally equivalent
codon" is used herein to refer to codons that encode the
same amino acid, such as the six codons for arginine or
serine, and also refers to codons that encode
biologically equivalent amino acids (as in Table 2).
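
The first sense of "functionally equivalent codon", codons
that encode the same amino acid, can be checked against the
standard genetic code. The sketch below builds that code in
RNA notation; the function names are illustrative, and the
second sense (codons for biologically equivalent amino
acids) is not modelled here.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order ('*' marks stop).
_BASES = "UCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def _normalize(codon):
    """Upper-case the codon and accept DNA notation (T read as U)."""
    return codon.upper().replace("T", "U")


def same_amino_acid(codon_a, codon_b):
    """True if both codons encode the same amino acid."""
    return CODON_TABLE[_normalize(codon_a)] == CODON_TABLE[_normalize(codon_b)]


def synonymous_codons(one_letter_code):
    """All codons that encode the given amino acid (one-letter code)."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == one_letter_code)


print(synonymous_codons("R"))
print(same_amino_acid("AGA", "CGU"))
```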

It will be understood that amino acid and nucleic acid
sequences may include additional residues, such as
additional N- or C-terminal amino acids or 5' or 3'
sequences, and yet still be essentially as set forth in
one of the sequences disclosed herein, so long as the
sequence meets the criteria set forth above, including
the maintenance of biological protein activity where
protein expression is concerned. The addition of
terminal sequences particularly applies to coding nucleic
acid sequences that may, for example, include various
non-coding sequences flanking either of the 5' or 3'
portions of the coding region, such as promoters.

Allowing for the degeneracy of the genetic code,
sequences that have between about 75% and about 85%; or
more preferably, between about 85% and about 95%; or even
more preferably, between about 95% and about 99% of
nucleotides that are identical to the nucleotides of SEQ
ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID
NO:5 or SEQ ID NO:6 (SEQ ID NO:1-6), will be sequences
that are "as set forth in SEQ ID NO:1-6". Sequences that
are essentially the same as those set forth in SEQ ID
NO:1-6 may also be functionally defined as sequences that
are capable of hybridizing to a nucleic acid segment
containing the complement of SEQ ID NO:1-6 under standard
conditions. Suitable hybridization conditions will be
well known to those of skill in the art and are clearly
set forth herein, e.g., see Example 4.

Table 2. Amino Acids and the Corresponding Codons.

Amino Acids                   Codons

Alanine         Ala   A       GCA  GCC  GCG  GCU
Cysteine        Cys   C       UGC  UGU
Aspartic acid   Asp   D       GAC  GAU
Glutamic acid   Glu   E       GAA  GAG
Phenylalanine   Phe   F       UUC  UUU
Glycine         Gly   G       GGA  GGC  GGG  GGU
Histidine       His   H       CAC  CAU
Isoleucine      Ile   I       AUA  AUC  AUU
Lysine          Lys   K       AAA  AAG
Leucine         Leu   L       UUA  UUG  CUA  CUC  CUG  CUU
Methionine      Met   M       AUG
Asparagine      Asn   N       AAC  AAU
Proline         Pro   P       CCA  CCC  CCG  CCU
Glutamine       Gln   Q       CAA  CAG
Arginine        Arg   R       AGA  AGG  CGA  CGC  CGG  CGU
Serine          Ser   S       AGC  AGU  UCA  UCC  UCG  UCU
Threonine       Thr   T       ACA  ACC  ACG  ACU
Valine          Val   V       GUA  GUC  GUG  GUU
Tryptophan      Trp   W       UGG
Tyrosine        Tyr   Y       UAC  UAU

Naturally, the present invention also encompasses
DNA segments that are complementary, or essentially
complementary, to the sequences set forth in SEQ ID NO:1,
SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5 and
SEQ ID NO:6. Nucleic acid sequences that are
"complementary" are those that are capable of
base-pairing according to the standard Watson-Crick
complementarity rules. As used herein, the term
"complementary sequences" means nucleic acid sequences
that are substantially complementary, as may be assessed
by the same nucleotide comparison set forth above, or as
defined as being capable of hybridizing to the nucleic
acid segment of SEQ ID NO:1-6 under relatively stringent
conditions such as those described herein as SEQ ID NO:1,
SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5 and
SEQ ID NO:6.
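
Watson-Crick complementarity, as referred to above, can be
illustrated with a short sketch that derives the
complementary strand of a DNA sequence and tests two strands
for exact complementarity. The function names and example
sequences are illustrative assumptions, and the exact test is
stricter than the "substantially complementary" standard
defined in the text.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def are_exactly_complementary(strand_a, strand_b):
    """True if the two strands (both written 5' to 3') pair at every position."""
    return reverse_complement(strand_a).upper() == strand_b.upper()


print(reverse_complement("ATGC"))
print(are_exactly_complementary("ATGC", "GCAT"))
```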

The DNA segments of the present invention include
those encoding biologically functional equivalent
proteins and peptides. Such sequences may arise as a
consequence of codon redundancy and functional
equivalency that are known to occur naturally within
nucleic acid sequences and the proteins thus encoded.
Alternatively, functionally equivalent proteins or
peptides may be created via the application of
recombinant DNA technology, in which changes in the
protein structure may be engineered, based on
considerations of the properties of the amino acids being
exchanged. Changes designed in the laboratory may be
introduced through the application of site-directed
mutagenesis techniques, e.g., to introduce improvements
to the antigenicity of the protein or to test S.
pneumoniae mutants in order to examine capsular
productivity at the molecular level.


If desired, one may also prepare fusion proteins and
peptides, e.g., where the cpsB, cpsC, cpsE, cpsD, cpsS,
cpsU, cpsM, plpA and tnpA coding regions are aligned
within the same expression unit with other proteins or
peptides having desired functions, such as for
purification or immunodetection purposes (e.g., proteins
that may be purified by affinity chromatography and
enzyme label coding regions, respectively).

As mentioned above, modification and changes may be
made in the structure of CpsB, CpsC, CpsE, CpsD, CpsS,
CpsU, CpsM, plpA or tnpA and still obtain a molecule
having like or otherwise desirable characteristics. For
example, certain amino acids may be substituted for other
amino acids in a protein structure without appreciable
loss of interactive binding capacity with structures such
as, for example, antigen-binding regions of antibodies or
binding sites on substrate molecules, receptors, or
catalytic regulation of capsular polysaccharide
production. Since it is the interactive capacity and
nature of a protein that defines that protein's
biological functional activity, certain amino acid
sequence substitutions can be made in a protein sequence
(or, of course, its underlying DNA coding sequence) and
nevertheless obtain a protein with like (agonistic)
properties. Equally, the same considerations may be
employed to create a protein or polypeptide with
countervailing (e.g., antagonistic) properties. It is
thus contemplated by the inventors that various changes
may be made in the sequence of SEQ ID NO:7, SEQ ID NO:11,
SEQ ID NO:12, SEQ ID NO:13 or SEQ ID NO:14, SEQ ID NO:15
or SEQ ID NO:16 proteins or peptides (or underlying DNA)
without appreciable loss of their biological utility or
activity.

It is also well understood by the skilled artisan
that, inherent in the definition of a biologically
functional equivalent protein or peptide, is the concept
that there is a limit to the number of changes that may
be made within a defined portion of the molecule and
still result in a molecule with an acceptable level of
equivalent biological activity. Biologically functional
equivalent peptides are thus defined herein as those
peptides in which certain, not most or all, of the amino
acids may be substituted. In particular, the function of
a given protein must be retained to be an equivalent. Of
course, a plurality of distinct proteins/peptides with
different substitutions may easily be made and used in
accordance with the invention.

It is also well understood that where certain
residues are shown to be particularly important to the
biological or structural properties of a protein or
peptide, e.g., residues in active sites, such residues
may not generally be exchanged. This is the case in the
present invention, where CpsD has a putative NAD-binding
site and active site region at residues 2 to 29 and
251-263 (SEQ ID NO:11), respectively.

Amino acid substitutions are generally based on the
relative similarity of the amino acid side-chain
substituents, for example, their hydrophobicity,
hydrophilicity, charge, size, and the like. An analysis
of the size, shape and type of the amino acid side-chain
substituents reveals that arginine, lysine and histidine
are all positively charged residues; that alanine,
glycine and serine are all a similar size; and that
phenylalanine, tryptophan and tyrosine all have a
generally similar shape. Therefore, based upon these
considerations, arginine, lysine and histidine; alanine,
glycine and serine; and phenylalanine, tryptophan and
tyrosine; are defined herein as biologically functional
equivalents.

In making changes, the hydropathic index of amino
acids may be considered. Each amino acid has been
assigned a hydropathic index on the basis of its
hydrophobicity and charge characteristics; these are:
isoleucine (+4.5); valine (+4.2); leucine (+3.8);
phenylalanine (+2.8); cysteine/cystine (+2.5); methionine
(+1.9); alanine (+1.8); glycine (-0.4); threonine (-0.7);
serine (-0.8); tryptophan (-0.9); tyrosine (-1.3);
proline (-1.6); histidine (-3.2); glutamate (-3.5);
glutamine (-3.5); aspartate (-3.5); asparagine (-3.5);
lysine (-3.9); and arginine (-4.5).

The importance of the hydropathic amino acid index
in conferring interactive biological function on a
protein is generally understood in the art (Kyte &
Doolittle, 1982, incorporated herein by reference). It
is known that certain amino acids may be substituted for
other amino acids having a similar hydropathic index or
score and still retain a similar biological activity. In
making changes based upon the hydropathic index, the
substitution of amino acids whose hydropathic indices are
within ±2 is preferred, those which are within ±1 are
particularly preferred, and those within ±0.5 are even
more particularly preferred.
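
The tiered preference just described can be expressed as a
simple lookup using the hydropathic index values listed
above. The one-letter keys, the function name and the tier
labels returned are illustrative choices; the index values
themselves are those recited in the text.

```python
# Hydropathic index values as recited above, keyed by one-letter code.
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def substitution_tier(original, replacement):
    """Classify a substitution by the difference in hydropathic index."""
    # Rounding guards against floating-point noise at the tier boundaries.
    diff = round(abs(HYDROPATHY[original] - HYDROPATHY[replacement]), 6)
    if diff <= 0.5:
        return "even more particularly preferred"
    if diff <= 1.0:
        return "particularly preferred"
    if diff <= 2.0:
        return "preferred"
    return None  # outside the +/-2 window


for pair in (("I", "V"), ("K", "R"), ("A", "C"), ("A", "G")):
    print(pair, substitution_tier(*pair))
```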

It is also understood in the art that the
substitution of like amino acids can be made effectively
on the basis of hydrophilicity. U.S. Patent 4,554,101,
incorporated herein by reference, states that the
greatest local average hydrophilicity of a protein, as
governed by the hydrophilicity of its adjacent amino
acids, correlates with its immunogenicity and
antigenicity, i.e. with a biological property of the
protein. It is therefore understood that an amino acid
can be substituted for another having a similar
hydrophilicity value and still obtain a biologically
equivalent, and in particular, an immunologically
equivalent protein.

As detailed in U.S. Patent 4,554,101, the following
hydrophilicity values have been assigned to amino acid
residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0
± 1); glutamate (+3.0 ± 1); serine (+0.3); asparagine
(+0.2); glutamine (+0.2); glycine (0); threonine (-0.4);
proline (-0.5 ± 1); alanine (-0.5); histidine (-0.5);
cysteine (-1.0); methionine (-1.3); valine (-1.5);
leucine (-1.8); isoleucine (-1.8); tyrosine (-2.3);
phenylalanine (-2.5); tryptophan (-3.4).

In making changes based upon similar hydrophilicity
values, the substitution of amino acids whose
hydrophilicity values are within ±2 is preferred, those
which are within ±1 are particularly preferred, and those
within ±0.5 are even more particularly preferred.
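
By way of illustration only, candidate replacements for a
given residue can be listed from the hydrophilicity values
recited above. The sketch uses the nominal values and ignores
the stated ± 1 ranges, which is a simplifying assumption; the
function name and one-letter keys are likewise illustrative.

```python
# Nominal hydrophilicity values as recited above, keyed by one-letter code.
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def candidate_substitutes(residue, tolerance=0.5):
    """Residues whose hydrophilicity value lies within `tolerance` of `residue`."""
    target = HYDROPHILICITY[residue]
    return sorted(
        other for other, value in HYDROPHILICITY.items()
        if other != residue and round(abs(value - target), 6) <= tolerance
    )


print("K:", candidate_substitutes("K"))
print("S:", candidate_substitutes("S", tolerance=1.0))
```

Similar hydrophilicity alone does not distinguish oppositely
charged residues (lysine and aspartate share the same nominal
value), so such a list would ordinarily be considered
together with the charge, size and shape criteria discussed
above.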

As outlined above, amino acid substitutions are
generally therefore based on the relative similarity of
the amino acid side-chain substituents, for example,
their hydrophobicity, hydrophilicity, charge, size, and
the like. Exemplary substitutions which take various of
the foregoing characteristics into consideration are well
known to those of skill in the art and include: arginine
and lysine; glutamate and aspartate; serine and
threonine; glutamine and asparagine; and valine, leucine
and isoleucine.

Antibody Generation 

Means for preparing and characterizing antibodies
are well known in the art (See, e.g., Antibodies: A
Laboratory Manual, Cold Spring Harbor Laboratory, 1988;
incorporated herein by reference). This invention thus
contemplates the generation of antibodies against the
proteins CpsB, CpsC, CpsE, CpsD, CpsS, CpsU, CpsM, PlpA
and transposase A or peptides derived therefrom. The
CpsB, CpsC, CpsE, CpsD, CpsS, CpsU, CpsM, PlpA and
transposase A proteins or peptides may be obtained using
standard methods of recombinant expression as is
routinely practiced in the art.

The methods for generating monoclonal antibodies
(MAbs) generally begin along the same lines as those for
preparing polyclonal antibodies. Briefly, a polyclonal
antibody is prepared by immunizing an animal with an
immunogenic composition in accordance with the present
invention and collecting antisera from that immunized
animal. A wide range of animal species can be used for
the production of antisera. Typically the animal used
for production of antisera is a rabbit, a mouse, a
rat, a hamster, a guinea pig or a goat. Because of the
relatively large blood volume of rabbits, a rabbit is a
preferred choice for production of polyclonal antibodies.

As is well known in the art, a given composition may
vary in its immunogenicity. It is often necessary
therefore to boost the host immune system, as may be
achieved by coupling a peptide or polypeptide immunogen
to a carrier. Exemplary and preferred carriers are
keyhole limpet hemocyanin (KLH) and bovine serum albumin
(BSA). Other albumins such as ovalbumin, mouse serum
albumin or rabbit serum albumin can also be used as
carriers. Means for conjugating a polypeptide to a
carrier protein are well known in the art and include
glutaraldehyde, m-maleimidobenzoyl-N-hydroxysuccinimide
ester, carbodiimide and bis-diazotized benzidine.


As is also well known in the art, the immunogenicity
of a particular immunogen composition can be enhanced by
the use of non-specific stimulators of the immune
response, known as adjuvants. Exemplary and preferred
adjuvants include complete Freund's adjuvant (a
non-specific stimulator of the immune response containing
killed Mycobacterium tuberculosis), incomplete Freund's
adjuvant and aluminum hydroxide adjuvant.

The amount of immunogen composition used in the
production of polyclonal antibodies varies depending upon
the nature of the immunogen as well as the animal used for
immunization. A variety of routes can be used to
administer the immunogen (subcutaneous, intramuscular,
intradermal, intravenous and intraperitoneal). The
production of polyclonal antibodies may be monitored by
sampling blood of the immunized animal at various points
following immunization. A second, booster injection may
also be given. The process of boosting and titering is
repeated until a suitable titer is achieved. When a
desired level of immunogenicity is obtained, the
immunized animal can be bled and the serum isolated and
stored, and/or the animal can be used to generate MAbs.

MAbs may be readily prepared through use of
well-known techniques, such as those exemplified in U.S.
Patent 4,196,265, incorporated herein by reference.
Typically, this technique involves immunizing a suitable
animal with a selected immunogen composition, e.g., a
purified or partially purified CpsB, CpsC, CpsE, CpsD,
CpsS, CpsU, CpsM, PlpA or transposase A protein,
polypeptide or peptide. The immunizing composition is
administered in a manner effective to stimulate antibody
producing cells. Rodents such as mice and rats are
preferred animals; however, the use of rabbit, sheep or frog
cells is also possible. The use of rats may provide
certain advantages (Goding, 1986, pp. 60-61), but mice
are preferred, with the BALB/c mouse being most preferred
as this is most routinely used and generally gives a
higher percentage of stable fusions.

Following immunization, somatic cells with the
potential for producing antibodies, specifically B
lymphocytes (B cells), are selected for use in the MAb
generating protocol. These cells may be obtained from
biopsied spleens, tonsils or lymph nodes, or from a
peripheral blood sample. Spleen cells and peripheral
blood cells are preferred, the former because they are a
rich source of antibody-producing cells that are in the
dividing plasmablast stage, and the latter because
peripheral blood is easily accessible. Often, a panel of
animals will have been immunized and the spleen of the animal
with the highest antibody titer will be removed and the
spleen lymphocytes obtained by homogenizing the spleen



WO 95/31548



PCT/US95/06119 



- 75 - 

with a syringe. Typically, a spleen from an immunized
mouse contains approximately 5 x 10^7 to 2 x 10^8
lymphocytes.

The antibody-producing B lymphocytes from the
immunized animal are then fused with cells of an immortal
myeloma cell, generally one of the same species as the
animal that was immunized. Myeloma cell lines suited for
use in hybridoma-producing fusion procedures preferably
are non-antibody-producing, have high fusion efficiency,
and have enzyme deficiencies that render them incapable of
growing in certain selective media which support the
growth of only the desired fused cells (hybridomas).

Any one of a number of myeloma cells may be used, as
are known to those of skill in the art (Goding, pp.
65-66, 1986; Campbell, pp. 75-83, 1984). For
example, where the immunized animal is a mouse, one may
use P3-X63/Ag8, X63-Ag8.653, NS1/1.Ag 4 1, Sp210-Ag14,
FO, NSO/U, MPC-11, MPC11-X45-GTG 1.7 and S194/5XX0 Bul;
for rats, one may use R210.RCY3, Y3-Ag 1.2.3, IR983F and
4B210; and U-266, GM1500-GRG2, LICR-LON-HMy2 and UC729-6
are all useful in connection with human cell fusions.

One preferred murine myeloma cell is the NS-1
myeloma cell line (also termed P3-NS-1-Ag4-1), which is
readily available from the NIGMS Human Genetic Mutant
Cell Repository by requesting cell line repository number
GM3573. Another mouse myeloma cell line that may be used
is the 8-azaguanine-resistant murine myeloma SP2/0
non-producer cell line.

Methods for generating hybrids of antibody-producing
spleen or lymph node cells and myeloma cells usually
comprise mixing somatic cells with myeloma cells in a 2:1
proportion, though the proportion may vary from about
20:1 to about 1:1, respectively, in the presence of an
agent or agents (chemical or electrical) that promote the
fusion of cell membranes. Fusion methods using Sendai
virus have been described by Kohler and Milstein (1975;
1976), and those using polyethylene glycol (PEG), such as
37% (v/v) PEG, by Gefter et al. (1977). The use of
electrically induced fusion methods is also appropriate
(Goding, pp. 71-74, 1986).

Fusion procedures usually produce viable hybrids at
low frequencies, about 1 x 10^-6 to 1 x 10^-8. However,
this does not pose a problem, as the viable, fused
hybrids are differentiated from the parental, unfused
cells (particularly the unfused myeloma cells that would
normally continue to divide indefinitely) by culturing in
a selective medium. The selective medium is generally
one that contains an agent that blocks the de novo
synthesis of nucleotides in the tissue culture medium.
Exemplary and preferred agents are aminopterin,
methotrexate, and azaserine. Aminopterin and
methotrexate block de novo synthesis of both purines and
pyrimidines, whereas azaserine blocks only purine
synthesis. Where aminopterin or methotrexate is used,
the medium is supplemented with hypoxanthine and thymidine
as a source of nucleotides (HAT medium). Where azaserine
is used, the medium is supplemented with hypoxanthine.

The preferred selection medium is HAT. Only cells
capable of operating nucleotide salvage pathways are able
to survive in HAT medium. The myeloma cells are
defective in key enzymes of the salvage pathway, e.g.,
hypoxanthine phosphoribosyl transferase (HPRT), and they
cannot survive. The B cells can operate this pathway,
but they have a limited life span in culture and
generally die within about two weeks. Therefore, the
only cells that can survive in the selective medium are
those hybrids formed from myeloma and B cells.
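
The selection logic just described can be summarized as a small truth
table. The following Python sketch is illustrative only: the names
Cell and survives_in_hat are hypothetical and do not appear in the
original disclosure, and each cell type is reduced to two assumed
properties, namely whether it can operate the nucleotide salvage
pathway and whether it can divide indefinitely in culture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Simplified cell model (hypothetical, for illustration only)."""
    name: str
    has_salvage_pathway: bool  # e.g., functional HPRT
    immortal: bool             # divides indefinitely in culture


def survives_in_hat(cell: Cell) -> bool:
    """Return True if the cell persists long-term in HAT medium.

    Aminopterin blocks de novo nucleotide synthesis, so a cell needs the
    salvage pathway to grow, and it must also be immortal to outlast the
    roughly two-week life span of primary B cells.
    """
    return cell.has_salvage_pathway and cell.immortal


CELLS = [
    Cell("unfused myeloma", has_salvage_pathway=False, immortal=True),
    Cell("unfused B cell", has_salvage_pathway=True, immortal=False),
    Cell("hybridoma", has_salvage_pathway=True, immortal=True),
]

survivors = [c.name for c in CELLS if survives_in_hat(c)]
print(survivors)
```

Under these assumptions only the hybridoma satisfies both conditions,
which mirrors the conclusion stated in the text.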
This culturing provides a population of hybridomas
from which specific hybridomas are selected. Typically,
selection of hybridomas is performed by culturing the
cells by single-clone dilution in microtiter plates,
followed by testing the individual clonal supernatants
(after about two to three weeks) for the desired
reactivity. The assay should be sensitive, simple and
rapid, such as radioimmunoassays, enzyme immunoassays,
cytotoxicity assays, plaque assays, dot immunobinding
assays, and the like.

The selected hybridomas would then be serially
diluted and cloned into individual antibody-producing
cell lines, which clones can then be propagated
indefinitely to provide MAbs. The cell lines may be
exploited for MAb production in two basic ways. A sample
of the hybridoma can be injected (often into the
peritoneal cavity) into a histocompatible animal of the
type that was used to provide the somatic and myeloma
cells for the original fusion. The injected animal
develops tumors secreting the specific monoclonal
antibody produced by the fused cell hybrid. The body
fluids of the animal, such as serum or ascites fluid, can
then be tapped to provide MAbs in high concentration.

The individual cell lines could also be cultured in
vitro, where the MAbs are naturally secreted into the
culture medium from which they can be readily obtained in
high concentrations. MAbs produced by either means may
be further purified, if desired, using filtration,
centrifugation and various chromatographic methods such
as HPLC or affinity chromatography.

The following examples are included to demonstrate
preferred embodiments of the invention. It should be
appreciated by those of skill in the art that the
techniques disclosed in the examples which follow
represent techniques discovered by the inventor to
function well in the practice of the invention, and thus
can be considered to constitute preferred modes for its
practice. However, those of skill in the art should, in
light of the present disclosure, appreciate that many
changes can be made in the specific embodiments which are
disclosed and still obtain a like or similar result
without departing from the spirit and scope of the
invention.

EXAMPLE 1

Isolation and Characterization of Capsule Mutants

A. Methods

1. Bacterial strains, plasmids, and culture conditions
The bacterial strains and plasmids used are listed
herein in Table 3. Culture conditions for S. pneumoniae
and E. coli were previously described (Dillard and
Yother, 1991). Erythromycin was used at 0.3 µg/ml and
streptomycin was used at 100 µg/ml in S. pneumoniae
cultures where indicated. Chloramphenicol was used at 1
µg/ml to detect transcription in S. pneumoniae isolates
carrying pJY4163/4164 chromosomal insertions.

Table 3. Bacterial strains and plasmids.

Strain/         Derivation and properties                         Source/Reference
Plasmid

Strain

S. pneumoniae
WU2             Type 3 encapsulated                               Briles et al. (1991b)
D39             Type 2 encapsulated                               Avery et al. (1944)
Rx1             Non-encapsulated mutant of R36A-A66 hybrid        Ravin (1959);
                                                                  Shoemaker and Guild (1974)
L8-2006         Type 1 encapsulated                               Dillard and Yother (1994)
DBL5            Type 5 encapsulated                               Yother et al. (1982)
BG9273          Type 6A encapsulated                              Dillard and Yother (1994)
EP2809          Type 8 encapsulated                               Dillard and Yother (1994)
305862          Type 9 encapsulated                               Dillard and Yother (1994)
LM100           Type 22 encapsulated                              Dillard and Yother (1994)
A66R2           Non-encapsulated mutant of A66 (Type 3)           Muckerman et al. (1982)
661             Non-encapsulated mutant of A66 (Type 3)           Bernheimer and Wermundsen (1972)
JD531           Non-encapsulated mutant of WU2, Em^r, cpsA1       This work
JD541           Non-encapsulated mutant of WU2, Em^r, cpsA2       This work
JD542           Non-encapsulated mutant of WU2, Em^r, cpsB1       This work
JD551           Non-encapsulated mutant of WU2, Em^r, cpsB2       This work
JD571           Non-encapsulated mutant of WU2, Em^r, cpsB3       This work
JD600           WU2 Str^r                                         This work
JD611           JD600 x JD531, Em^s, Str^r, cpsA1                 This work
JD614           JD600 x JD551, Em^s, Str^r, cpsB2                 This work
JD619           JD600 x JD541, Em^s, Str^r, cpsA2                 This work
JD692           JD600 x JD542, Em^s, Str^r, cpsB1                 This work
JD816           JD600 x JD571, Em^s, Str^r, cpsB3                 This work
JD636           WU2 Str^r, Rif^r, Nov^r                           This work
JD752           Isolate of transformation pool W62 x JD611,       This work
                type 3 encapsulated, Em^r
JD770           pJD330 x WU2, type 3 encapsulated, Em^r           This work
JD803           JD770 x D39, type 3 encapsulated, Em^r            This work
JD871           pJD366 x D39, type 2 encapsulated, Em^r           This work
JD872           JD871 x WU2, type 2 encapsulated, Em^r            This work
JD875           pJD366 x DBL5, type 5 encapsulated, Em^r          This work
JD908           pJD369 x WU2, non-encapsulated, Em^r              This work

E. coli
LE392           F- hsdR514 (rK- mK+) supE44 supF58                Tilghman et al. (1977)
                Δ(lacIZY)6 galK2 galT22 metB1 trpR55 λ-

Plasmid
pJY4163 and     Lack origin of replication for S. pneumoniae.     Yother et al. (1992)
pJY4164         Promoterless cat gene downstream of multiple
                cloning site (opposite orientations in 4163,
                4164), Em^r
pJD330          WU2 Sau3AI fragment cloned into pJY4163 BamHI     This work
                site, isolated from JD752
pJD337          pJY4163::1.5 kb XbaI-PvuII fragment of pJD330     This work
pJD343          pJY4164::0.4 kb MunI fragment of pJD330           This work
pJD345          pJY4164::1.1 kb MunI fragment of pJD330           This work
pJD351          pJY4164::2.4 kb Sau3AI-Sau3AI fragment of         This work
                pJD330, orientation opposite pJD330
pJD353          pJY4164::1.6 kb Sau3AI-XbaI fragment of pJD330    This work
pJD357          pJY4164::0.3 kb MunI-Sau3AI fragment of pJD330    This work
pJD359          pJY4164::0.6 kb PvuII-HaeIII fragment of pJD330   This work
pJD361          pJY4164::0.45 kb XbaI-PstI fragment of pJD330     This work
pJD362          pJY4164::0.4 kb HaeIII-MunI fragment of pJD330    This work
pJD364          WU2 3.2 kb HindIII fragment cloned into pJY4164   This work
                HindIII site
pJD366          WU2 3.2 kb HindIII fragment cloned into pJY4164   This work
                HindIII site, orientation opposite pJD364
pJD368          pJY4164::0.45 kb RsaI-MunI fragment of pJD330     This work
pJD369          pJY4164::0.55 kb PvuII-MunI fragment of pJD330    This work
pJD374          WU2 1.2 kb Sau3AI fragment cloned in pJY4163      This work
pJD377          pJY4164::1.2 kb SacI-HindIII fragment of pJD364   This work
pJD380          pJY4164::0.36 kb Sau3AI-SspI fragment of pJD330   This work

Em^r, erythromycin resistant; Em^s, erythromycin sensitive;
Str^r, streptomycin resistant; Rif^r, rifampicin resistant;
Nov^r, novobiocin resistant.

2. General DNA techniques

Techniques for DNA fragment isolation, ligations,
and plasmid isolation and purification were performed as
previously described (Dillard and Yother, 1991; and as
described by Sambrook et al., 1989, the relevant portions
incorporated herein by reference). Plasmid screenings
were done by scraping colonies from agar plates and
incubating these in the lysis solution of Kado and Liu,
3% SDS, 50 mM Tris, pH 12.6 (Kado and Liu, 1981). The
lysates were run on agarose gels to determine plasmid
sizes.

3. Library construction

A plasmid library of random fragments was
constructed by digesting chromosomal DNA from S.
pneumoniae strain WU2 to completion with Sau3AI and
ligating these fragments into the BamHI site of pJY4163.
The resulting ligation mixture was electroporated into E.
coli LE392, and transformants were selected on L agar
plates containing 300 µg erythromycin/ml. Individual
colonies were patched on erythromycin plates. Each plate
contained 100 colonies and constituted a pool.
Transformants were pooled by scraping the plates.

4. Transformations

Encapsulated strains of S. pneumoniae were induced
to competence as has been described in Yother et al.,
1986, incorporated herein by reference. Non-encapsulated
strains were made competent for transformation by growth
in Todd-Hewitt broth (Difco, Detroit, MI) supplemented
with 0.01% CaCl2, 0.5% BSA, and 0.5% yeast extract.
S. pneumoniae cells were allowed to express transforming
DNA for 2 h before plating on agar medium.
Electroporation of washed E. coli LE392 cells
resuspended in 10% glycerol was performed in a BTX
Electro Cell Manipulator 600 according to the
instructions of the manufacturer (Biotechnologies and
Experimental Research, Inc., San Diego, CA).

5. Preparation of S. pneumoniae chromosomal DNA

Cultures of S. pneumoniae (100 ml) were grown to
stationary phase in the presence of 1% choline chloride
to prevent autolysis (Briese and Hakenbeck, 1963). The
bacteria were centrifuged at 5000 rpm for 10 min and
resuspended in 2.5 ml TE buffer (10 mM Tris-HCl, 1 mM
EDTA, pH 8.0). SDS was added to 1%, and the cells were
lysed at 65°C for 15 min. One fifth volume 5 M KOAc (pH
8) was added, and incubation was continued at 65°C for 15
min, followed by incubation on ice for 60 min. Cell
debris was removed by centrifugation at 10,000 rpm for 10
min, the supernatant was added to 2 volumes of ethanol,
and the DNA was hooked out with a glass rod. The DNA was
dried, resuspended in TE, and further purified by
CsCl/ethidium bromide buoyant density gradient
centrifugation (Radloff et al., 1967).

6. Recovery of plasmids resolved from S. pneumoniae
chromosomes

A 10 ml culture of late log phase S. pneumoniae was
centrifuged at 5000 rpm for 10 min. The supernatant was
removed, and the cells were resuspended in 100 µl lysis
buffer (Saunders and Guild, 1980). Following a 5 min
incubation at 37°C, 900 µl of Birnboim and Doly solution
I was added, and the rest of the alkaline lysis procedure
was carried out as for E. coli (Birnboim and Doly, 1979).
The resulting preparation contained very little plasmid
DNA and was therefore electroporated into E. coli, where
significant quantities of plasmid could be obtained and
isolated as described (Birnboim and Doly, 1979).

7. Mapping by chromosomal transformation

The integration frequency was used to determine the 
linkage of spontaneous mutations. Chromosomal DNA from a 



performed by the standard technique (Ausubel et al.,
1987). Monoclonal antibody 16.3 was used to detect type
3 capsular polysaccharide (Briles et al., 1981a). ELISA
plates were read at 405 nm in a Biotek model 320 plate
reader (Bio-Tek Instruments, Winooski, VT).

9. Percoll gradient centrifugation

For density determinations, 10 ml of log phase cells
were centrifuged at 4000 rpm for 10 min, washed once with
water, and then resuspended in 1 ml water. A volume of
300 µl of cells was loaded on top of a 10 ml 0-100 or
25-100% continuous Percoll gradient. Gradient density
marker beads were loaded on top of the gradients. The
gradients were centrifuged at 10,000 rpm for 15 min with
the brake off. Percoll and density marker beads were
purchased from Pharmacia (Piscataway, NJ).
Non-encapsulated strains of S. pneumoniae exhibit a
higher density in Percoll gradients than encapsulated
strains (Briles et al., 1992). Percoll gradients were
also used to enrich for encapsulated cells expected to
result from low frequency events: they were
used to obtain binary capsule type transformants and to
enrich for spontaneous revertants to capsule production.


B. Results

To identify the type 3 capsule region of S.
pneumoniae, a Sau3AI library of fragments was cloned from
the type 3 encapsulated strain WU2 to direct insertions
into the chromosome of strain WU2. The library used in
the insertion-duplication mutagenesis procedure was
constructed by cloning random Sau3AI fragments in the
plasmid pJY4163, which carries an erythromycin-resistance
marker. Since this plasmid is unable to replicate in S.
pneumoniae, all erythromycin-resistant transformants
should contain insertions at the chromosomal site of the
target Sau3AI fragment (Morrison et al., 1984). By
transforming the library of clones into strain WU2, the
inventors obtained 5 non-mucoid isolates among 451
erythromycin-resistant transformants. However, further
studies involving transformation of the parent strain
with either chromosomal DNA from the mutants or plasmids
recovered from the mutants showed that the plasmid
insertions were neither linked to nor responsible for the
capsule mutations.

To determine if the non-mucoid isolates were truly
deficient in the production of type 3 capsular
polysaccharide, several methods were employed. In
slide-agglutination assays, none of the five mutants
reacted with polyclonal antisera specific for type 3
polysaccharide. Centrifugation through Percoll density
gradients revealed that the mutant strains were much
denser than the encapsulated parent strain. WU2 cells
had a density <1.01 g/ml, whereas all five mutants
migrated at 1.09 to 1.10 g/ml. These data suggested that
complete capsules were not produced by the mutants.
However, these tests might not reveal the presence of
short or sparse polysaccharide chains on the cell
surfaces or capsular material not translocated to the
surfaces. To determine if such material was present,
ELISA analysis was carried out on crude cell extracts.
Capsular material in the extracts was detected using a
monoclonal antibody directed against type 3
polysaccharide. The analysis indicated that mutants
JD531 and JD541 (designated mutants A1 and A2
respectively) made no detectable capsular material,
whereas mutants JD542, JD551, and JD571 (designated
mutants B1, B2 and B3 respectively) made significant
levels of reactive material. The common laboratory
strain Rx1 was also found to produce significant levels
of type 3 capsular material (FIG. 1). Although Rx1 is
generally referred to as a non-encapsulated derivative of
the type 2 strain D39, it was transformed three times
with chromosomal DNA from derivatives of the type 3
strain A66, chosen twice for type 3 encapsulation, and
chosen finally for non-encapsulation (Ravin, 1959;
Shoemaker and Guild, 1974).


No capsular material could be detected by ELISA in
the culture supernatant fluids of mutants JD531 and
JD541, indicating that these mutants were not merely
defective in attachment of the polysaccharide. Only low
levels of capsular material were detected in supernatants
of Rx1, JD542, JD551, and JD571 cultures.

The five mutations resulting in the
capsule-deficient phenotype were mapped to two loci by
chromosomal transformation. Reciprocal crosses between
the mutants yielded encapsulated transformants for each
combination, but not for transformation of a mutant with
DNA from the same strain. The mutations were thereby
determined to be genotypically distinct. The
transformations also revealed that the mutations in JD531
and JD541 were more closely linked to each other than to
the mutations in JD542, JD551, and JD571. Likewise, the
mutations in JD542, JD551, and JD571 were more closely
linked to one another than to the other two mutations.
The integration frequencies for those mutations judged to
be closely linked were similar (0.02 to 0.03) and were
ten-fold lower than those judged to be not as closely
linked. The genotypic data thus agreed with the
phenotypic data, i.e., the two mutations leading to total
loss of capsule synthesis mapped together, and the three
mutations causing lack of proper capsular polysaccharide
processing mapped together. The loci containing these
mutations were temporarily designated cpsA and cpsB,
respectively, and the mutations were named as indicated
in FIG. 1.

Table 4 shows the transformation frequencies of
capsule mutants. S. pneumoniae strains were
transformed with chromosomal DNA from strain JD636, a
streptomycin-resistant WU2 (Table 3), and
streptomycin-resistant transformants were selected. Transformation
frequencies are calculated from cultures not induced to
competence. With optimal induction, strain WU2 may
exhibit a transformation frequency approaching that of
the non-encapsulated mutants. However, during the
mutagenesis procedure, sub-optimal transformation
frequencies were observed (0.0003 to 0.006%). JD908
(Table 3) was also included; it contains an insertion
mutation resulting in loss of capsule expression
(described in FIG. 4). It would appear that the
non-encapsulated mutants are highly transformable,
suggesting that their over-representation
in the original transformant population was due to
their transformability. The mutagenesis procedure, by
selecting for transformability, has enriched for mutants
already deficient in capsule (Table 4).

Table 4. Transformation frequencies of capsule mutants.

Recipient   Mutation   Total cfu    Str^r cfu(a)   Transformation
                                                   frequency (%)
JD531       cpsA1      2.0 X 10^    5.0 X 10^      0.3
JD541       cpsA2      1.4 X 10®    7.6 X 10^      0.6
JD542       cpsB1      2.4 X 105    20 X 10^       0.8
JD551       cpsB2      2.0 X 105    4.6 X 10^      0.2
JD571       cpsB3      2.4 X 10®    5.0 X 10^      0.2
WU2         wt         0.0 X 10^    1              0.0000
JD908       cpsS       10 X 10^     10 X 10^       0.1

(a) Str^r, streptomycin resistant.



EXAMPLE 2

Identification of a Clone Containing a Capsule Gene

To identify DNA fragments capable of repairing the
cpsA1 mutation, JD611, a derivative of JD531 lacking the
pJY4163 insertion, was transformed with pools of pJY4163
clones containing Sau3AI fragments from strain WU2.
Transformations and DNA manipulations were performed as
described in Example 1. In this insertion-duplication
restoration procedure, the plasmid clone is inserted into
the mutant chromosome, leading to duplication of the
homologous target fragment and restoration of one wild
type copy of cpsA (FIG. 2). Erythromycin-resistant
transformants were screened visually for the mucoid
phenotype. One plasmid clone was identified which
restored encapsulation in the cpsA1-containing mutant.
Due to the duplication of the target fragment, the
plasmid insertion could resolve out of the chromosome by
homologous recombination at low frequency. Therefore,
transformation of E. coli with DNA from the encapsulated
transformant and selection for erythromycin-resistance
allowed recovery of the plasmid, designated pJD330, that
had repaired the cpsA1 defect.

Transformation of the capsule-deficient mutants with
pJD330 suggested that the clone contained part of cpsA.
When pJD330 was inserted into the chromosome of the
cpsA1-containing mutant, 56% of the
erythromycin-resistant transformants became encapsulated
(Table 5). The failure of the remainder of the
transformants to become encapsulated indicated that the
cloned fragment contained only one end of the gene. The
site of the crossover relative to the site of the
mutation determines whether the mutation will be located
in the incomplete copy of the gene or the full-length
copy. If the recombination occurs on the left, as shown
in FIG. 2, the full-length gene is wild type and capsule
is restored. However, if the crossover occurs on the
right, the mutation is located in the full-length copy,
and no capsule is obtained. This interpretation is
supported by the observation that transformants of the
cpsA1 mutant which incorporated the plasmid but did not
become encapsulated spontaneously gave rise to
encapsulated, erythromycin-resistant colonies, either by
excision and reinsertion of the plasmid or by gene
conversion. The cpsA2 defect was not repaired by pJD330,
suggesting that the site of this mutation was either not
present on the plasmid clone or was located too near the
end of the clone for crossover to repair the defect.

Table 5. Restoration of encapsulation with pJD330.(a)

Recipient   Mutation   Ery^r cfu(b)   Cps+ cfu   Cps+ frequency (%)
JD611       cpsA1      475            267        56
JD619       cpsA2      26             0          0
JD692       cpsB1      13             0          0
JD614       cpsB2      124            0          0
Rx1         cps-       158            49         31
661         capD6      56             0          0
A66R2       capD4      4              3          75

(a) Mutants deficient in capsule production were
    transformed with pJD330 DNA. Transformants were
    plated on erythromycin to select for those
    containing chromosomal insertions of pJD330.
    Erythromycin-resistant transformants were screened
    for mucoidy. Cps+ frequency is the ratio of
    Cps+/Ery^r cfu.

(b) Ery^r, erythromycin resistant.
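
The Cps+ frequency column of Table 5 is the ratio of Cps+ to
erythromycin-resistant colony-forming units, expressed as a
percentage. The short Python sketch below recomputes that column from
the counts listed in the table. The function name cps_frequency and
the rounding to the nearest whole percent are assumptions made for
illustration; they are not stated in the original text.

```python
def cps_frequency(ery_r_cfu: int, cps_plus_cfu: int) -> float:
    """Percentage of erythromycin-resistant transformants that are Cps+."""
    if ery_r_cfu <= 0:
        raise ValueError("ery_r_cfu must be positive")
    return 100.0 * cps_plus_cfu / ery_r_cfu


# (recipient, Ery^r cfu, Cps+ cfu) as listed in Table 5
TABLE_5 = [
    ("JD611", 475, 267),
    ("JD619", 26, 0),
    ("JD692", 13, 0),
    ("JD614", 124, 0),
    ("Rx1", 158, 49),
    ("661", 56, 0),
    ("A66R2", 4, 3),
]

# Round to whole percentages (assumed convention, matching the table).
frequencies = {name: round(cps_frequency(ery, cps)) for name, ery, cps in TABLE_5}
for name, pct in frequencies.items():
    print(f"{name}: {pct}%")
```

With these counts the recomputed values for JD611, Rx1 and A66R2 are
56, 31 and 75, in agreement with the tabulated frequencies.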






Transformation of pJD330 into the cpsB-containing
mutants did not restore any of these to encapsulation,
suggesting that cpsB is not present on pJD330. However,
strain Rx1 was restored to type 3 encapsulation by pJD330
(Table 5).

EXAMPLE 3

cpsA Codes for UDP-Glucose Dehydrogenase

Transformation of two previously characterized
mutants suggested that pJD330 contains part of the gene
for UDP-glucose dehydrogenase. UDPG dehydrogenase is the
enzyme which converts UDP-glucose (UDPG) into
UDP-glucuronic acid (UDPGA). UDPG and UDPGA are the two
nucleotide sugars which are required for type 3 capsule
synthesis (Smith et al., 1960). The non-encapsulated
mutants A66R2 and 661 were previously shown to be
deficient in the production of UDPG dehydrogenase due to
mutations in the locus designated capD (Bernheimer and
Wermundsen, 1972). Transformation of A66R2 (capD4) with
pJD330 restored encapsulation, whereas transformation of
661 (capD6) did not (Table 5).

Transformations with chromosomal DNA (Table 6)
confirmed that the other cpsA and capD mutations were
closely linked to the region cloned in pJD330.
Transformations and other DNA manipulations were
performed as described in Example 1. Strain JD770 was
obtained by inserting pJD330 into the chromosome of the
parental strain WU2. JD770 was found to produce wild
type amounts of type 3 capsule. Chromosomal DNA from
JD770 was used to transform those mutants which could not
be restored to encapsulation by pJD330 (Table 6). The
cpsA2 mutation and the UDPG dehydrogenase mutation capD6
were found to be >90% linked to the plasmid insertion.
Table 6. Restoration of encapsulation with chromosomal
DNA linked to pJD330 insertion.(a)

Recipient   Mutation   Ery^r cfu(b)   Cps+ cfu   Cps+ frequency (%)
JD692       cpsB1      17             13         70
JD614       cpsB2      79             58         73
JD619       cpsA2      42             39         93
661         capD6      6              6          100

(a) Transformants were plated on erythromycin to select
    for those which had incorporated the region
    containing the insertion. Erythromycin-resistant
    transformants were screened for mucoidy. Cps+
    frequency is the ratio of Cps+/Ery^r cfu.

(b) Ery^r, erythromycin resistant.



Deletion analysis was performed to more closely
localize the sites of the mutations cpsA1, capD4, and the
mutation in Rx1 (cps-). By transforming with plasmid
subclones and making no selection for insertion of the
plasmids, the inventors were able to observe
recombination events that occurred as a result of double
crossovers between the cloned fragment and its homolog in
the chromosome (FIG. 3A). Transformations with several
subclones revealed that the sites of the mutations could
all be localized to a 250 bp PvuII-SspI fragment common
to pJD380 and pJD369 (FIG. 3B). The fact that the same
fragment which restores encapsulation in a UDPG
dehydrogenase mutant also restores encapsulation in the
cpsA1-containing mutant suggests that cpsA encodes UDPG
dehydrogenase. From here on, the cpsA locus is designated
cps3D (see Example 9).



It is known that transformation of UDPG
dehydrogenase mutants, including 661, with chromosomal
DNA from a type 1 strain restored type 3 encapsulation by
incorporation of the type 1 specific genes at a site
other than that occupied by the type 3 genes (Bernheimer
et al., 1967; Bernheimer and Wermundsen, 1972). Type 1
capsular polysaccharide contains galacturonic acid;
therefore, type 1 strains are expected to produce UDPG
dehydrogenase (Austrian et al., 1959). When UDPG
dehydrogenase mutants of a type 3 strain were transformed
with DNA from type 1 strains, the UDPG dehydrogenase from
type 1 complemented the type 3 mutation, allowing the
production of both capsular polysaccharides. The UDPG
dehydrogenase gene from the type 1 strain was never
observed to repair the mutation in the type 3 gene
(Bernheimer and Wermundsen, 1969). When the cpsA1 mutant
JD611 was transformed with chromosomal DNA from a type 1
strain, type 3 encapsulated transformants were obtained
at a frequency of 3 X 10*^. This frequency is in
agreement with that observed for transformation of mutant
661 (capD6) to binary encapsulation (Bernheimer and
Wermundsen, 1972) and above the spontaneous reversion
frequency (<8 X 10'^).

EXAMPLE 4

Genetic and Physical Map of the Type 3 Capsule Region

A. Methods

1. Southern Blotting

Southern blotting was performed using the vacuum
blotter and chemiluminescent detection system purchased
from Stratagene (La Jolla, CA). The PosiBlot® 30-30
pressure blotter is part of an integrated system designed
to transfer DNA or RNA from agarose gels quickly and
efficiently onto solid support matrices, such as
Stratagene's hybridization membranes including the
Nitrocellulose membranes, Duralose-UV™ membranes
(reinforced nitrocellulose), Duralon-UV™ membranes
(nylon) or Illuminator™ nylon membranes.
Following electrophoresis, the gels are stained in 5
µg/ml of ethidium bromide (EtBr) in water, destained in
water and then photographed. Prior to Southern transfer,
the gels are pretreated by depurination, denaturation and
neutralization. Depurination entails treating the gels
with 0.25 N HCl for 5-30 minutes with gentle shaking.
Denaturation consists of pouring off the HCl and adding a
0.5 N NaOH and 1.5 M NaCl denaturation solution, enough
to cover the gel. The gels are treated for 5 minutes to
one hour with gentle shaking. Neutralization involves
pouring off the denaturation solution and adding a 0.1 M
Tris-HCl (pH 7.5) and 1.5 M NaCl neutralization solution,
enough to cover the gel. The gels are then treated for 5
minutes to one hour with gentle shaking.

Gels are then ready for blotting, which is performed
with gloved hands. The membrane is prewet in distilled
water (dH2O) and then in transfer buffer for 5 minutes.
10x SSC buffer, 10x SSPE buffer or 25 mM sodium phosphate
(pH 6.5) is the transfer buffer for nylon membranes. For
nitrocellulose or Duralose-UV membranes, 20x SSC buffer
should be used.

The membrane and gel are set up in the PosiBlot 30-30
pressure blotter, and pressure is exerted and adjusted to
75 mm Hg. Blotting times vary for different gels and depend
on the amount and size of the nucleic acid; the size,
thickness and percentage of the gel; and the depth of the
gel wells and volume of sample loaded on the gel; these
are routinely optimized. After the allotted blotting time,
the position of the wells on the membrane is marked and
the gel removed. The gel is generally stained and
destained in ethidium bromide to check the efficiency of
transfer. The membrane is removed from the device and
placed on clean Whatman 3MM paper to allow the excess
buffer to be absorbed. Once the membrane is free of
standing liquid, but still damp, the membrane and the
Whatman 3MM paper are placed under a UV light and
crosslinked. Alternatively, the membrane is dried in an
80°C drying oven for 1-2 hours prior to crosslinking.

Boehringer Mannheim's Genius Nonradioactive
Detection System, a chemiluminescence-based, nucleic acid
detection kit, permits fast, safe and sensitive detection
of DNA and RNA immobilized on nylon membranes. As little
as 0.1 pg of target plasmid DNA can be detected in a
30-minute exposure of the processed blot to X-ray film or,
in a similar exposure time, 1 pg of a single-copy gene
can be detected in less than 1.0 µg of genomic DNA. The
Nonradioactive Detection System can also be used for
rapid Northern-blot analysis of RNA.

After transfer and crosslinking, the membrane is
prehybridized for 1 hour at 42°C. The labeled probe is
placed in a microfuge tube containing 100 µl of sonicated
salmon sperm DNA (10 mg/ml) stock and heated in a boiling
water bath for 5 minutes. The tube is pulse-spun to collect
condensation and stored on ice until the probe is ready to
be added to the hybridization. The probe is added to
prehybridization solution and hybridized, with shaking,
overnight at 42°C using standard hybridization solutions as
described by the Genius® protocol. The membrane is washed
once for 15 minutes at room temperature in 0.1x SSC/0.1%
SDS and then washed twice at 60°C in 0.1x SSC/0.1% SDS,
15 minutes for each wash. The membrane is then ready for
detection. The BRL 1 Kb DNA ladder was used as a molecular
weight size standard (Bethesda Research Laboratories,
Gaithersburg, MD). Biotin labeled probes for hybridization
were prepared by nick-translation using the BRL BioNick kit
(Bethesda Research Laboratories). High stringency
conditions, as described above, should result in the
detection of sequences ≥95% homologous to the probe.

Reduced stringency was achieved by lowering the wash, or
hybridization and wash, temperatures to room temperature.
At reduced stringency, sequences with 85% homology to the
probe should have been detectable.

B. Results 

Using pJD330 as a probe in Southern hybridizations,
a physical map of the type 3 capsule region of strain WU2
was developed (FIG. 4). Using the information gained
from the chromosomal mapping, the inventors identified
and cloned the HindIII fragment located to the right of
the pJD330 insert. HindIII fragments approximately 3 kb
in size were cloned from the WU2 chromosome into pJY4164.
By using pJD330 to screen for homology, a clone
containing a 3.2 kb insert was identified. This clone,
pJD366, was then used to determine the location of the
cpsB mutants.



Transformation mapping using JD770 showed that the
cpsB mutations were about 74% linked to the pJD330
insertion (Table 6). This high frequency indicated that
cpsB might be adjacent to the fragment contained in
pJD330. When pJD366 was used to transform strains
containing the cpsB mutations, encapsulation was not
restored. Insertion of pJD366 into the WU2 chromosome
did not alter the production of type 3 capsule;
therefore, the inventors were able to examine linkage of
the insertion to the cpsB mutations and determine the
relative location of cpsB. The pJD366 insertion was
found to be only 25% linked to the cpsB mutations (as
compared to 74% for the pJD330 insertion), suggesting
that cpsB is located to the left of the pJD330 insert, as
shown in FIG. 4.
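
The linkage comparison in the preceding paragraph can be sketched numerically. The following is only an illustration, assuming that a higher co-transformation linkage reflects a shorter physical distance between an insertion and a mutation; the function name is hypothetical and is not part of the methods described.

```python
def nearest_insertion(linkage_percent):
    """Return the insertion showing the highest linkage to the mutation.

    `linkage_percent` maps an insertion name to its percent linkage.
    Assumption (for illustration): higher linkage means a shorter distance.
    """
    return max(linkage_percent, key=linkage_percent.get)


# Percent linkage of each plasmid insertion to the cpsB mutations (from the text).
cpsB_linkage = {"pJD330": 74, "pJD366": 25}

print(nearest_insertion(cpsB_linkage))
```

Because the pJD366 fragment lies to the right of the pJD330 insert on the map, the closer association of cpsB with pJD330 is what places cpsB on the left side.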



In order to localize regions necessary for capsule
production, insertion mutations using subclones of pJD330
and pJD366 were made. Transformation of strain WU2 with
plasmids containing fragments internal to a gene or
operon required for capsule production should result in
loss of encapsulation. Insertion of the plasmid
containing the Sau3AI-XbaI fragment resulted in loss of
encapsulation, indicating that this entire 1.6 kb
fragment is within a single gene or operon required for
capsule synthesis. Likewise, all insertions within this
region eliminated capsule production (FIG. 4). Insertion
of the plasmid containing the XbaI-PstI fragment did not
disrupt capsule production, indicating that the end of
the gene or operon is contained within this fragment.
None of the other insertions resulted in loss of capsule,
indicating they were not internal to genes or operons
required for capsule synthesis.

Since the plasmids used for the chromosomal
insertions contain a promoterless cat gene, the inventors
were able to establish the directions of transcription at
the insertion sites. All insertions which contained the
cat gene in the orientation to detect transcription
proceeding to the right (as drawn in FIG. 4) resulted in
chloramphenicol resistance. No transcription was
detected in the opposite direction (FIG. 4).

EXAMPLE 5 
Homology with Other Capsule Types

If the type-specific genes for capsule production
are contained within a cassette, as has been proposed,
these genes should show little homology to the
type-specific genes from other capsule types. A high
degree of homology should exist in the regions flanking
the type-specific region (Austrian et al., 1959;
Bernheimer and Wermundsen, 1972). The flanking regions
may contain common genes necessary for production of
capsule of any type or might not be involved in capsule
production.
To determine if the regions cloned in pJD330 and
pJD366 are specific to type 3 or are present in strains
of other capsule types, HindIII digested chromosomal DNAs
from strains of types 2, 3, 5, 6A, 8, 9, and 22 were
Southern blotted and probed with these plasmids or
fragments thereof. The fragment contained in pJD330 (the
probe used was pJD351, containing a 2.4 kb Sau3AI fragment
from pJD330) hybridized only with DNA from the type 3
strain, detecting the expected bands at 2.2 and 3.2 kb.
No hybridization with the chromosomal DNA of the other
six serotypes was detected, nor could the stringency be
sufficiently lowered to detect homology in these strains.

When HindIII digests of chromosomal DNAs from these
same strains were probed with the HindIII fragment from
pJD366, the expected 3.2 kb band was observed in the type
3 strain, but a 1.1 kb band was found in every other
capsule type. Probing with subclones of pJD366
containing the 2.1 kb HindIII-SacI fragment or the 1.2 kb
SacI-HindIII fragment revealed that the homology resided
in the more distal 1.2 kb fragment. Therefore, unlike
the remaining 4.2 kb of DNA, which could be detected only
in the type 3 strains, the 1.2 kb SacI-HindIII fragment
(pJD377) showed a high degree of homology and could be
detected at high stringency in all strains (2, 3, 5, 6A,
8, 9 and 22). This result suggests that this region may
be the highly homologous flanking DNA predicted by the
model to be adjacent to the type-specific genes.

EXAMPLE 6

Transformation of Capsule Type 

To determine if all the type-specific genes
necessary for the production of type 3 capsular
polysaccharide were closely linked on the pneumococcal
chromosome, strain JD770 was used as a donor in
transformation of the type 2 strain D39. Laboratory
techniques were as described in Example 1. Seventy-three
erythromycin-resistant transformants were obtained, and
all 73 expressed type 3 capsule. No type 2 capsule could
be detected by agglutination with type 2 specific
antisera. Using chromosomal DNA from strain JD770,
successful transformation of strains of type 5 and type 6B
to type 3 encapsulation was also performed (Example 18).



By transforming the type 2 strain D39 with pJD366,
isolates were obtained with the erythromycin-resistance
marker closely linked to the type 2 capsule genes. These
transformants were the result of recombination between
the flanking regions of homologous non-type-specific DNA.
Using DNA from one of these isolates, JD871, to transform
the type 3 strain WU2 resulted in 95% co-transformation
of type 2 encapsulation with erythromycin resistance.
The remaining 5% were found to be type 3 encapsulated,
indicating that only the flanking DNA or the plasmid
alone was transferred to these isolates. Insertion of
pJD366 into the type 5 strain DBL5 also resulted in a
strain, JD875, with the erythromycin resistance marker
linked to the type-specific genes. This strain was
successfully used to transform WU2 to type 5
encapsulation.



EXAMPLE 7 
Direct Test of the Cassette Model



Transformations and other DNA manipulations were
performed as described in Example 1. Southern blotting
was performed as described in Example 4.

If capsule type change involves a cassette-type
recombination mechanism, then transformation of capsule
type should result in replacement of the recipient's
type-specific genes by those of the donor. In order to
determine if such replacement does occur, DNA from the
type 3-specific region was used to probe HindIII digested
chromosomal DNA from a strain which was originally type 2
and was transformed to type 3 (JD803), and a strain which
was originally type 3 and was transformed to type 2
(JD872) (FIG. 5).

HindIII digested chromosomal DNAs from strains of type 2
(D39 and JD871 from Example 6); 3 (WU2); 3/2 (JD872); and
2/3 (JD803) were used in Southern blotting. First, the
Southern blot was probed with pJD343 and pJD368.
Together these plasmids contain an 800 bp region
(HaeIII-MunI) specific to type 3 and internal to cps3S
(FIG. 5). The type 3 parent WU2 contained the expected
2.4 kb HindIII fragment specific to type 3, whereas
neither the type 2 parent D39 nor its derivative JD871,
which has pJD366 inserted into the chromosome, contained
this fragment. When JD871 (type 2) was used to transform
WU2, the resulting strain JD872 was type 2 encapsulated
and had lost the 2.4 kb type 3-specific fragment.
Similarly, when D39 was transformed with DNA from JD770
(type 3), the resulting strain JD803 was type 3
encapsulated and had gained the type 3-specific fragment.

Reprobing of the same blot with the 1.2 kb
SacI-HindIII fragment common to all capsule types
(pJD377) revealed that the 1.1 kb HindIII fragment was
present in each of the strains that now produced the type
2 capsule. Further, JD803 had also gained the 3.2 kb
HindIII fragment present in WU2 (type 3), 2.1 kb of which
is type 3-specific. This fragment was also present in
JD871 and JD872 since it is contained in the plasmid
insert (FIG. 5).

The loss of type 3 genes by the strain converted to
type 2 encapsulation indicated that capsule type change
does not occur by addition, but rather by replacement of
the type-specific genes.
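
The reasoning of this Example can be restated as a small consistency check. The sketch below is illustrative only: it records whether each strain carries the type 3-specific sequence detected by the first probe, and compares the observed transformants against what a "replacement" model and an "addition" model would each predict. The two model names, the function names, and the entry for the donor JD770 (assumed to carry the type 3-specific sequence because it is a type 3 strain) are assumptions made for illustration.

```python
# True means the strain carries the type 3-specific sequence detected by the probe.
# The JD770 entry is an assumption: it is described only as a type 3 donor.
HAS_TYPE3_SEQUENCE = {"WU2": True, "D39": False, "JD871": False, "JD770": True}

# transformant -> (recipient, donor), as described in the text
CROSSES = {"JD872": ("WU2", "JD871"), "JD803": ("D39", "JD770")}

# Observed result for each transformant with the type 3-specific probe
OBSERVED = {"JD872": False, "JD803": True}


def predicted(recipient, donor, model):
    """Predict whether a transformant carries the type 3-specific sequence."""
    if model == "replacement":
        # Donor type-specific genes replace those of the recipient.
        return HAS_TYPE3_SEQUENCE[donor]
    if model == "addition":
        # Donor genes are added; recipient genes are retained.
        return HAS_TYPE3_SEQUENCE[recipient] or HAS_TYPE3_SEQUENCE[donor]
    raise ValueError(f"unknown model: {model}")


def consistent_models():
    """Return the models whose predictions match every observed transformant."""
    return [
        model
        for model in ("replacement", "addition")
        if all(
            predicted(recipient, donor, model) == OBSERVED[strain]
            for strain, (recipient, donor) in CROSSES.items()
        )
    ]


print(consistent_models())
```

Under these assumptions, only the replacement model accounts for the loss of the type 3-specific fragment in JD872.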

EXAMPLE 8 

The DNA Sequence and Amino Acid Sequence of Cps3D

A. Methods 

1. DNA Sequencing 

Templates for sequencing were prepared from
double-stranded plasmid DNA by denaturing with NaOH (2 N)
for 5 min at room temperature, and precipitating with 5 M
NH4OAc and ethanol. DNA was sequenced by the Sanger
dideoxy method using the Sequenase 2.0 kit (US
Biochemicals, Cleveland, Ohio) and 35S-dATP. The
oligonucleotide primers 5'-GCCACTATCGACTACGCG-3' (SEQ ID
NO:17) and 5'-TCATTTGATATGCCTCCG-3' (SEQ ID NO:18),
corresponding to bases 308 to 325 and 445 to 428 of the
cloning vectors pJY4163 and pJY4164 (Yother, et al.,
1992), respectively, were used routinely. Primers
5'-GTGAGATAAATAGTAGTGCG-3' (SEQ ID NO:19) and
5'-TCCAGCTCGTGTCATAATCT-3' (SEQ ID NO:20), corresponding to
bases 3474 to 3493 and 3596 to 3577, respectively, of the
type 3 capsule locus (FIG. 6G) were also used. All
oligonucleotide primers were purchased from Oligos, etc.
(Wilsonville, OR). DNA sequencing of PCR products was
performed using the US Biochem PCR product sequencing
kit, according to the directions of the manufacturer.
PCR products were sequenced at least twice, from separate
amplification reactions. Greater than 97% of the
sequence was obtained for each strand.
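
As a simple cross-check of the primer descriptions above, the length of each oligonucleotide can be compared with the base range it is said to correspond to, and the reverse-strand primers can be converted to top-strand orientation. The sketch below is illustrative; the helper names are hypothetical, and the span computed for SEQ ID NO:19 and SEQ ID NO:20 assumes, purely for illustration, that the two were used together as a pair.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def span_length(start, end):
    """Number of bases in an inclusive coordinate range, in either orientation."""
    return abs(end - start) + 1


# (name, sequence 5'->3', first base, last base) as given in the text
PRIMERS = [
    ("SEQ ID NO:17", "GCCACTATCGACTACGCG", 308, 325),
    ("SEQ ID NO:18", "TCATTTGATATGCCTCCG", 445, 428),
    ("SEQ ID NO:19", "GTGAGATAAATAGTAGTGCG", 3474, 3493),
    ("SEQ ID NO:20", "TCCAGCTCGTGTCATAATCT", 3596, 3577),
]


def lengths_match(primers):
    """Map each primer name to whether its length equals its stated base range."""
    return {
        name: len(seq) == span_length(start, end)
        for name, seq, start, end in primers
    }


print(lengths_match(PRIMERS))
# Top-strand sequence matched by the reverse primer SEQ ID NO:20
print(reverse_complement("TCCAGCTCGTGTCATAATCT"))
# Region spanned if SEQ ID NO:19 and SEQ ID NO:20 are used as a pair (assumption)
print(span_length(3474, 3596))
```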

2. Sequence analysis 

The University of Wisconsin Genetics Computer Group
programs (Genetics Computer Group, 1991) were used in the
analysis of the DNA sequence. Database searches were
cps3D was amplified from a 3.5 kb EcoRV fragment from the
WU2 chromosome.

To isolate the region 5' of the repeat sequence, a
SacI-MscI fragment internal to the repeat region
(extending from nucleotide 1 to 257 of SEQ ID NO:4) was
first cloned into the insertion vector pSF151 (kanamycin
resistant, Km^r), and used to direct an
insertion-duplication event into the type 3 S. pneumoniae
WU2 chromosome. Chromosomal DNA from the resulting Km^r
strain, JD1008, was digested with HindIII, self-ligated,
and transformed into E. coli. The resulting Km^r
plasmid, pRS111, contained the pSF151 vector and DNA
flanking the insertion, i.e., DNA extending from the
HindIII site in cps3B to the HindIII site in the repeat
sequence (~2.3 kb of S. pneumoniae DNA).



B. Results 

The cps3D nucleotide sequence is shown in FIG. 6E
(SEQ ID NO:5). The Cps3D amino acid sequence (SEQ ID
NO:11) is highly homologous (56% identity, 73%
similarity) to that of the UDP-glucose dehydrogenase
(HasB) from Streptococcus pyogenes (Dougherty and
van de Rijn, 1993). Two other sequences were detected in
GenBank which shared a high degree of homology with
Cps3D. These open reading frames from the Escherichia
coli and Salmonella enteritica rfb clusters have not been
shown biochemically or genetically to be UDP-glucose
dehydrogenases (Bastin, et al., 1993), but they share a
high degree of homology with HasB and Cps3D.

Cps3D (SEQ ID NO:11) has several characteristics
consistent with it being UDP-glucose dehydrogenase. The
N-terminal amino acid residues 2 to 29 have all the
characteristics of an NAD-binding site (Wierenga, et al.,
1986), and this sequence is very homologous to regions
from HasB, AlgD (the GDP-mannose dehydrogenase of
Pseudomonas aeruginosa [Deretic, et al., 1987]), and the
two potential UDP-glucose dehydrogenases from E. coli and
S. enteritica. The homology with AlgD was previously
noted by Garcia et al. in the deduced amino acid
sequence of the S. pneumoniae gene cap3-1 (Garcia, et
al., 1993). They suggested that Cap3-1 was the type 3
UDP-glucose dehydrogenase. SEQ ID NO:1 and SEQ ID
NO:5 are in complete agreement with the sequence of Garcia
et al. from the EcoRV site to the ScaI site (nucleotides
883 to 1377, FIG. 6D and FIG. 6E, containing amino acids 1
to 117, SEQ ID NO:11). However, no other homology was seen,
suggesting that these investigators had cloned only the
5' end of the gene.

The Cps3D sequence at amino acid residues 251 to 263
(SEQ ID NO:11) is consistent with this being the active
site of the enzyme. This region is identical at the
amino acid level with that of HasB and the putative E.
coli and S. enteritica UDP-glucose dehydrogenases. The
homology of the active site region of HasB with that of
bovine UDP-glucose dehydrogenase and AlgD has been fully
described (Dougherty and van de Rijn, 1993). The
cysteine at residue 259 (SEQ ID NO:11) of Cps3D contains
the essential thiol group of the reactive site (Ridley,
et al., 1975). The predicted size of Cps3D (45 kDa) is
also similar to the size of the E. coli enzyme (47 kDa)
(Schiller, et al., 1976).

EXAMPLE 9

Identification of Capsule Mutants

DNA sequencing and manipulations were performed as 
described in Example 8. 

To determine the nature of the two cpsA mutations
identified in Example 1, the regions were amplified from
the chromosomes of the mutant strains and sequenced.
Each mutant (JD611 and JD619) contained a single base
pair transversion resulting in a premature stop codon in
the cps3D sequence. The locations of the mutations are
indicated in FIG. 6E.

To localize the three cpsB mutations, also
identified in Example 1, located upstream of the
UDP-glucose dehydrogenase mutations (cpsA), standard PCR or
chromosome crawling was used to amplify fragments from
the parent type 3 chromosome that contained either the
5' end coding sequence of cps3D (nucleotides 1027 to 1802,
FIG. 6E), the promoter and the 5' end of cps3D
(nucleotides 885 to 1802, FIG. 6D and FIG. 6E), or the 5'
end of cps3D plus approximately 1 kb of upstream DNA
(nucleotides 1 to 1802). Each of these fragments was used
to transform the capsule-deficient mutants JD614 and
JD692. JD692 could be transformed to encapsulation using
the 5' end coding sequence of cps3D, whereas JD614 was
not restored to encapsulation by this fragment but was
restored by the fragment containing the 5' end plus 141
bp of upstream DNA, including the promoter. Both of the
mutants were restored by the 1.8 kb fragment containing
the 5' end of cps3D and the upstream DNA, and neither was
restored with a fragment containing the 3' end of cps3D
(nucleotides 1759 to 2385, FIG. 6E). Thus, these upstream
mutations are not located in a separate gene but are in
either the cps3D structural gene or its promoter. Since
some capsule material is produced by these mutants, a
mutation within the coding region (as in JD692) must be a
missense mutation or an in-frame deletion or insertion
which reduces the activity of the enzyme. The mutation
in JD614 may be in the promoter, and thus a
promoter-down mutation, or it may be in the structural
gene but too close to the beginning of the gene for
recombination and repair to occur with the fragment used.
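
The fragment-rescue logic above can be sketched as interval arithmetic on the nucleotide coordinates given in the text. This is a simplified illustration only: it treats a mutation as repairable by any fragment that covers its position, and so ignores the requirement for flanking homology that the paragraph itself raises for JD614. The function name is hypothetical.

```python
def candidate_interval(restoring, non_restoring):
    """Return (first, last) positions covered by every restoring fragment
    and by no non-restoring fragment, or None if no position qualifies.

    Fragments are (start, end) tuples with inclusive coordinates.
    Simplifying assumption: coverage alone is sufficient for repair.
    """
    positions = None
    for start, end in restoring:
        covered = set(range(start, end + 1))
        positions = covered if positions is None else positions & covered
    positions = positions or set()
    for start, end in non_restoring:
        positions -= set(range(start, end + 1))
    return (min(positions), max(positions)) if positions else None


# Fragment coordinates from the text
CODING_5_END = (1027, 1802)
PROMOTER_AND_5_END = (885, 1802)
UPSTREAM_AND_5_END = (1, 1802)
THREE_PRIME_END = (1759, 2385)

jd692 = candidate_interval(
    restoring=[CODING_5_END, UPSTREAM_AND_5_END],
    non_restoring=[THREE_PRIME_END],
)
jd614 = candidate_interval(
    restoring=[PROMOTER_AND_5_END, UPSTREAM_AND_5_END],
    non_restoring=[CODING_5_END, THREE_PRIME_END],
)
print(jd692, jd614)
```

Under this simplified model the JD692 mutation falls within the coding fragment upstream of the 3' fragment, and the JD614 mutation falls in the short region present only in the promoter-containing fragment; the alternative explanation for JD614 given in the text lies outside what this sketch can represent.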
Amplification and sequencing of the 250 bp PvuII-SspI
fragment from the mutant strains A66R2 and Rx1
showed that each contained a missense mutation in the
cps3D coding sequence (FIG. 6E).

EXAMPLE 10 
DNA Sequences of cps3S and cps3U

DNA sequencing and analysis were performed as
described in Example 8.

The region just downstream of cps3D contains a
second gene, cps3S, that is required for type 3 capsular
polysaccharide biosynthesis. An open reading frame, 1248
bp in length, is transcribed in the same direction as
cps3D and is in the same reading frame (SEQ ID NO:5).
The direction of transcription is in agreement with that
determined using cat insertions as described in Example
4. Only 15 bp separate a potential start codon for cps3S
from the stop codon of cps3D. The sequence AGGGG just
upstream of the putative start codon may serve as a
ribosome binding site (FIG. 6E), or, due to the close
proximity of cps3D, no ribosome binding site may be
necessary. The deduced amino acid sequence of Cps3S
predicts a protein of 48 kDa (SEQ ID NO:12) if the first
start codon at nucleotide 1 (nucleotide 2227 in FIG. 6F)
is utilized. Other potential start codons are located at
nucleotides +19 and +61 (nucleotides 2245 and 2287 in
FIG. 6F, respectively); however, neither of these is
positioned near a ribosome binding site.

A short region of dyad symmetry was detected
downstream of cps3S at nucleotides 3718 to 3738 (FIG. 6F).
The scores for primary and secondary structure yielded by
the TERMINATOR program of the GCG sequence analysis
package (p=3.95, s=22) suggest that this region could
function as a weak rho-independent terminator. However,
this sequence is 241 bases past the cps3S stop codon and
closer to the start of the next open reading frame, i.e.,
cps3U. Therefore, the sequence may have more to do with
the expression of cps3U than of cps3S. In fact, a
potential promoter sequence for cps3U was detected
upstream of the region of dyad symmetry, suggesting the
potential structure could serve as an attenuator of cps3U
expression. The cps3U open reading frame (SEQ ID NO:5),
918 bp in length, is transcribed in the same direction as
cps3D and cps3S, and is predicted to encode a protein of
34 kDa (SEQ ID NO:13).
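
The relationship between open reading frame length and predicted protein size can be approximated with a common rule of thumb. The sketch below is illustrative only and is not the method used to obtain the sizes stated above: it assumes an average residue mass of about 110 Da and assumes that the stated reading frame length includes the stop codon. Both assumptions, and the function name, are introduced here for illustration.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # assumed rule-of-thumb value, not from the text


def estimated_mass_kda(orf_length_bp, includes_stop=True):
    """Roughly estimate protein mass (kDa) from an open reading frame length.

    Assumes the length is a whole number of codons and, by default, that the
    final codon is a stop codon which does not encode a residue.
    """
    if orf_length_bp % 3 != 0:
        raise ValueError("reading frame length must be a multiple of 3")
    codons = orf_length_bp // 3
    residues = codons - 1 if includes_stop else codons
    return residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


# cps3U reading frame length from the text
print(estimated_mass_kda(918))
```

For the 918 bp cps3U frame this rough estimate is close to the 34 kDa stated above. The rule is coarse, since actual masses depend on amino acid composition, and a predicted size computed from the deduced sequence should be preferred.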



EXAMPLE 11 

Cps3S is Homologous to Polysaccharide Synthases

A search of GenBank revealed that the predicted
Cps3S protein (SEQ ID NO:12) is homologous to
polysaccharide synthases. The greatest degree of
homology was found with HasA, the hyaluronic acid
synthase from S. pyogenes (23% identity, 50% similarity)
(DeAngelis, et al., 1993b; Dougherty and van de Rijn,
1994). Hyaluronic acid consists of alternating N-acetyl
glucosamine and glucuronic acid residues. Hyaluronic
acid and the pneumococcal type 3 capsule are similar in
structure in that both are composed of β(1-4) linked
repeating disaccharide units containing glucuronic acid.
Like pneumococcal type 3 capsule, hyaluronic acid capsule
contains both β(1-3) and β(1-4) linkages; however, the
linkage to glucuronic acid is β(1-4) in hyaluronic acid
but β(1-3) in type 3 capsule (Reeves and Goebel, 1941).
Homology was also seen between Cps3S and NodC from
Rhizobium meliloti (21% identity, 47% similarity). NodC
is necessary for the synthesis of nodulation factor, a
substituted oligosaccharide consisting of β(1-4) linked
N-acetyl glucosamine residues (Lerouge, et al., 1990).
It has previously been noted that HasA and NodC are
homologous to polysaccharide synthases, including FBF15
of Stigmatella aurantiaca, pDG42 of Xenopus laevis, and
chitin synthases from both Saccharomyces cerevisiae and
Candida albicans (DeAngelis, et al., 1993b; Dougherty and
van de Rijn, 1994; Atkinson and Long, 1992; Debellé, et
al., 1992). Cps3S is also homologous to these proteins.
These results suggest that Cps3S is the type 3 capsular
polysaccharide synthase.

The PILEUP program was used to align the amino acid
sequences of the bacterial polysaccharide synthases Cps3S
(SEQ ID NO:12), HasA, NodC, and FBF15. Only a few
clusters of amino acids are found to be conserved in all
four proteins. A few of these, GKR (residues 131 to 133
SEQ ID NO:12), an acidic region VDSD (153 to 156 SEQ ID
NO:12), DRXLT (256 to 260 SEQ ID NO:12), QQXRW (292 to
296 SEQ ID NO:12), and WXTR (418 to 421 SEQ ID NO:12),
are also found in the eukaryotic polysaccharide
synthases.
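
Conserved clusters written with X as a placeholder for any residue, such as those listed above, can be located in a protein sequence with a simple pattern search. The following sketch is illustrative only: the helper names are hypothetical, and the short test sequence is invented for demonstration and is not Cps3S or any sequence from the disclosure.

```python
import re


def motif_to_regex(motif):
    """Compile a motif in which 'X' stands for any single residue."""
    pattern = "".join("." if ch == "X" else re.escape(ch) for ch in motif)
    return re.compile(pattern)


def find_motif(motif, protein):
    """Return 1-based, inclusive (start, end) residue positions of each match."""
    return [(m.start() + 1, m.end()) for m in motif_to_regex(motif).finditer(protein)]


# Conserved clusters named in the text
MOTIFS = ["GKR", "VDSD", "DRXLT", "QQXRW", "WXTR"]

# Invented toy sequence for demonstration only (not a real protein)
TOY_PROTEIN = "MAGKRLLVDSDAADRSLTPPQQARWLLWNTR"

for motif in MOTIFS:
    print(motif, find_motif(motif, TOY_PROTEIN))
```

Applied to an actual sequence such as SEQ ID NO:12, the same search would be expected to report the residue ranges given in the text, although that has not been run here.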

Since all four proteins contain highly hydrophobic
stretches, hydrophobic amino acids are found conserved at
several locations throughout the proteins. Four
hydrophobic stretches identified in Cps3S are found in
all four proteins. These regions may span the cell
membrane. This hypothesis has been supported for NodC.
Immunogold labeling revealed a surface location for NodC,
and the C-terminal hydrophobic region was shown to direct
the insertion of an alkaline phosphatase fusion protein
to the cell membrane (Johnson, et al., 1989; John, et
al., 1988). Earlier studies indicated that the type 3
capsule synthesizing activity also has a membrane
location (Smith, et al., 1961). The last hydrophobic
stretch may be required for the function of Cps3S since
the insertion in JD897 which eliminated this region (the
last 45 amino acids of the protein) resulted in loss of
capsule production (FIG. 6F). Expression of Cps3S in
E. coli was, like that of NodC, lethal to the host.
A. Method

1. Expression of Cps3S

A 2.1 kb Sau3AI-PstI fragment containing the 3' end
of cps3D and the entire cps3S gene was cloned from
pJD351 into the expression vector pKK223-3 (Brosius and
Holy, 1984) at the polylinker BamHI-PstI sites to yield
pJD424. Cultures of E. coli TG-1 (Sambrook, et al.,
1989) or TG-1 transformants were grown to exponential
phase, at which time isopropyl-β-D-thiogalactoside (IPTG)
was added to a concentration of 1 mM to induce expression
from the tac promoter of pKK223-3.



Transformations and other DNA manipulations were
performed as described in Example 1.

B. Results 

The sequence from residues 211 to 233 in NodC was
noted for the large number of cysteine residues. It has
been suggested that this region participates in the
binding of divalent cations which are necessary for the
production of chitin and chitin-like molecules (Atkinson
and Long, 1992). Type 3 capsule synthesis requires Mg++
(Smith, et al., 1960). Although this region in Cps3S
contains only one cysteine, the region is highly
conserved between all four proteins.



The GenBank search also revealed that Cps3S has
homology over short stretches to the rhamnosyl
transferase RfbN from Salmonella enteritica, which is
necessary for the production of O-antigen in type B
strains. This enzyme creates an α(1-4) linkage to
mannose in the O-antigen repeat unit. The homologous
regions are a subset of those conserved regions common to
HasA, NodC, and Cps3S, but the best homology is seen in
the region 229 to 278 (SEQ ID NO:12).
In the production of Group B type III capsular
polysaccharide, the galactosyl transferase CpsD transfers
a galactose to a molecule located in the cell membrane.
Rubens et al. (1993) suggested that the acceptor may be
dolichol or a related molecule, and identified a region
of CpsD with homology to putative dolichol binding
regions of several proteins. Although it is not clear
that such sequences are actually involved in dolichol
recognition or binding (Schutzbach, et al., 1993),
several similar regions (e.g., at residues 7 to 20, 21 to
38, and 388 to 401, as numbered in SEQ ID NO:12) are
present in Cps3S. Since the putative dolichol binding
motif [FL(F/I)VXFXXI(P/L)FXFY] (Albright, et al., 1989;
Kelleher, et al., 1992) is a highly hydrophobic sequence
that is rich in phenylalanines, the sites in Cps3S may
actually reflect the hydrophobicity of the molecule and
the A-T rich bias in the DNA sequence rather than
indicating a specificity for dolichol-like molecules. It
is not known whether S. pneumoniae utilizes an
intermediate acceptor in capsule synthesis; however, the
capsular polysaccharides of several serotypes have been
found to be covalently linked to the cell wall. Type 3
capsule, by contrast, is not covalently linked to the
cell and is generally considered an exopolysaccharide
(Sorensen, et al., 1990). Therefore, if Cps3S does use a
membrane bound acceptor, it is likely not the final
acceptor.



EXAMPLE 12 

Cps3U is Homologous to Glucose-1-Phosphate
Uridylyltransferases and Cps3M is Homologous
to Phosphomutases



The gene downstream of cps3S is designated as cps3U
(SEQ ID NO:5) based on its probable function. The amino
acid sequence of Cps3U (SEQ ID NO:13) showed a high
degree of homology with glucose-1-phosphate
uridylyltransferases from several other bacterial
species. The highest degree of homology was found with
GtaB from Bacillus subtilis (55% identity, 73%
similarity). The active site of glucose-1-phosphate
uridylyltransferase has not been characterized from any
of the bacterial enzymes; however, the active site in the
enzyme from potato tuber (Solanum tuberosum) has been
investigated. Kazuta et al. recognized 5 lysine
residues present at the active site (Kazuta, et al.,
1991), and by mutational studies Katsube et al. showed
that one of these residues was important for function,
and a second was absolutely required (Katsube, et al.,
1991). Cps3U contains 24 lysines, six of which are
absolutely conserved among the six bacterial
glucose-1-phosphate uridylyltransferases in the database.
Only one region from Cps3U containing a conserved lysine
can be aligned well with the potato tuber enzyme sequence.
It is homologous to the region containing the required
lysine.

The final gene in SEQ ID NO:5 is cps3M, with the
deduced amino acid sequence of SEQ ID NO:14. The Cps3M
amino acid sequence revealed significant homology to both
phosphoglucomutases (PGM) and phosphomannomutases (PMM)
from a diverse group of microorganisms. Contained within
Cps3M is a phosphoserine signature sequence
(GIMVTASHTPAPFNG) conserved within the reported active
sites of both PGMs and PMMs. However, approximately 15%
of the C terminus present in other phosphomutases, and
apparently important for their function, is absent
from Cps3M. Phosphomutase activity from a recombinant
Cps3M was not detected in E. coli, suggesting that cps3M
may encode a non-functional protein.
EXAMPLE 13

cps3S and cps3D are Transcribed as an Operon

A. Methods 

Southern blotting was performed as described in
Example 4. All other DNA manipulations, including
insertion deletion mutations, were performed as described
in Example 1. The locations of mutations can be seen in
FIG. 6E and FIG. 6F.

B. Results 

Use of fragments subcloned from the cps3DSU region
to direct insertion-duplication mutations in the parent
type 3 chromosome resulted in several mutants that
produced no detectable capsule (FIG. 9) and exhibited the
extremely rough phenotype described by Taylor (1949).
The colonies were very small and rough, and the cells
clumped when grown in liquid culture. DNA sequencing
revealed that the sites of the mutations are within cps3S
(FIG. 6F). The lack of capsule production in these
mutants must be due to loss of cps3S expression, rather
than to a polar effect on downstream genes, since
insertions within cps3U or cps3M, the next genes
downstream, had no apparent effect on capsule production.

Molecular and genetic evidence suggests that cps3S is in
an operon with cps3D. Sequence analysis revealed no
potential promoter sequences in the region upstream of
cps3S (FIG. 6E, SEQ ID NO:5). The phenotypes of several
insertion mutants also suggest that no promoter is
located in the 3' end of cps3D and that cps3S is
transcribed from the cps3D promoter. The sites of these
insertions are shown in FIG. 6E, FIG. 6F and FIG. 7;
however, the structures of the mutations are more fully
illustrated in FIG. 9. To ensure that the plasmids had
inserted as expected for insertion-duplication mutations,



wo 95/31548 



PCT/DS9S/06n9 



- 113 - 

chromosomal DNA from the mutant strains was subjected to 
Southern blot analysis. 

Chromosomal DNA from the insertion mutants was digested
with MscI/FspI for JD982, MscI/SalI for JD983, and
MscI/KpnI for JD908, JD902, and JD900, run on agarose
gels, and blotted as described in Example 4. The blots
were probed with vector pJY4164. Increasing distance from
the MscI site to the end of the vector was demonstrated by
an increase in the size of the upper band. A faint band in
the JD982 lane was observed, likely a result of partial
digestion. The 4.7 kb and 4.8 kb bands in JD982 and JD908,
respectively, indicate that these mutants contain a
duplication of the inserted plasmid.

Insertion of the plasmids results in a duplication
of the cloned fragment. Therefore, mutant strains such
as JD908, in which the duplicated fragment contains both
the 5' end of cps3S and the 3' end of cps3D, have a
full-length copy of cps3S downstream of the plasmid insertion.
In addition, the full-length copy is contiguous with the
3' end of cps3D. Therefore, if cps3S had its own
promoter, or if one were located in the 3' end of cps3D,
these insertions should not result in a loss of cps3S
expression. However, four such insertions have been made
in the WU2 chromosome (JD846, JD897, JD898, and JD908),
and even with a duplication of up to 450 bp of the 3' end
of cps3D, a loss of capsule production was observed.

Two more internal insertions in cps3D were created.
As expected, these insertions eliminated capsule
production (FIG. 9). However, since cps3D and cps3S are
transcribed as an operon, this result does not prove that
cps3D is required for capsule synthesis. That fact is
demonstrated by the lack of capsule production seen in
strains containing non-polar point mutations in cps3D
(Example 14).

EXAMPLE 14

In Vitro Polymerization Assay

To evaluate the competence of the mutants to
synthesize type 3 capsule, an in vitro polymerization
assay was used.



A. Methods 

1. In vitro polysaccharide synthesis

Type 3 capsular polysaccharide was synthesized and
quantitated in vitro using a modification of the method
of Smith, et al., 1961. Crude extracts containing cell
membranes and cytoplasm were prepared from 200 ml of
S. pneumoniae cultures harvested at an OD600 of 0.25 as
described (Yother and White, 1994), except that cell
material was concentrated 200-fold, and all steps were
performed using a thioglycolate buffer (10 mM sodium
thioglycolate, 5 mM MgSO4, 100 mM Tris-HCl pH 8.3) to
stabilize the enzymes (Smith, et al., 1960). The
digestion of cell wall material by mutanolysin treatment
was performed in this buffer and 20% sucrose.
Protoplasts were sonicated three times for 15 s at 35%
power at 0°C.


Polysaccharide synthesis was carried out at 34°C for
2 h in a 1 ml reaction containing 100 µl of extract, 5 mM
UDP-glucose, 5 mM UDP-glucuronic acid (where indicated),
and 1 mM NAD, in the thioglycolate buffer. The reaction
was boiled for 1 min and then quickly cooled to 25°C in H2O.
Following centrifugation for 30 s at 8160 x g, the type
3-specific monoclonal antibody 16.3 (Briles, et al., 1981a)
was added in excess to the supernatant and incubation was
continued at 37°C for 30 min.



The specific antigen-antibody complexes were
measured at 650 nm in a spectrophotometer, and the amount
of capsule was determined by comparison with a standard
curve prepared using purified type 3 polysaccharide
purchased from ATCC (Rockville, MD) (Bernheimer, 1953).
Reactions were done in triplicate and were standardized
to protein content of the crude extract, as determined in
duplicate using the Bio-Rad Laboratories (Hercules, CA)
protein assay kit.
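
The quantitation step above can be illustrated with a minimal sketch, assuming the standard curve is linear over the working range. The function names, the standard amounts, the absorbance readings and the protein value are hypothetical and are not taken from the assay data; only the overall procedure (a 650 nm standard curve, triplicate reactions, normalization to protein) follows the text.

```python
def fit_standard_curve(amounts_ug, absorbances):
    """Least-squares line A650 = slope * amount + intercept."""
    n = len(amounts_ug)
    mean_x = sum(amounts_ug) / n
    mean_y = sum(absorbances) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts_ug)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(amounts_ug, absorbances))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cps_per_mg_protein(a650_replicates, slope, intercept, protein_mg):
    """Average the replicate readings, read the polysaccharide amount
    off the standard curve, and normalize to protein content."""
    mean_a650 = sum(a650_replicates) / len(a650_replicates)
    cps_ug = (mean_a650 - intercept) / slope
    return cps_ug / protein_mg


# Hypothetical standards (µg purified type 3 polysaccharide) and readings.
slope, intercept = fit_standard_curve([0.0, 2.0, 4.0, 8.0],
                                      [0.00, 0.10, 0.20, 0.40])
# Hypothetical triplicate reaction and protein content (mg) of the extract.
example = cps_per_mg_protein([0.24, 0.25, 0.26], slope, intercept,
                             protein_mg=0.5)
print(round(example, 2))
```

A straight-line fit is the simplest choice; if the turbidity response were found to curve at high polysaccharide amounts, the standards would need to be restricted to the linear range or a different curve fitted.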



B. Results

The spontaneous mutants JD611 and JD619 (cpsA1 and
cpsA2), which contain stop mutations in cps3D, produce no
detectable capsular material. However, both synthesized
high molecular weight type 3 polysaccharide in a
cell-free system in vitro when provided with the nucleotide
sugar precursors, i.e., UDP-glucose and UDP-glucuronic
acid (Table 7). No capsule was produced by these mutants
when UDP-glucuronic acid was omitted from the reaction.
These results indicate that these mutants produce no
capsule due to the lack of UDP-glucuronic acid and
support the conclusion that Cps3D is the UDP-glucose
dehydrogenase. They also confirm that stop mutations in
cps3D are not polar on cps3S. The increased amount of
polysaccharide produced by the WU2 extract (as compared
to that produced by that of JD611 or JD619) may be
explained by the observation of Smith et al. (1961) that
increased amounts of type 3 capsule are produced in vitro
when a small amount of unpurified polysaccharide is
already present in the reaction.

The mutants which contain insertions within cps3S
(JD902), or between the full-length copies of cps3D and
cps3S (JD908, JD897), were unable to synthesize
significant amounts of capsule even with both precursors
present. These results emphasize the role of Cps3S in
capsule synthesis and support the conclusion that cps3D
and cps3S are transcribed as an operon.

The capsule-deficient mutants JD614 and JD692
synthesized only small amounts of additional
polysaccharide in the in vitro assay. This result is
somewhat surprising since JD692, which was shown to
contain a missense mutation within the cps3D coding
region, should still make a functional Cps3S (i.e., the
cps3D mutation must not be polar since the intact cells
are able to synthesize some polysaccharide). The result
may suggest that the defective UDP-glucose dehydrogenase
in some way interferes with the ability to synthesize the
normal polysaccharide. Alternatively, the stability of
the cps3DS transcript may be altered by the mutation,
resulting in a reduced amount of Cps3S.

Table 7. In vitro capsule synthesis assay



Strain    Capsule          UDPGA(b)   CPS (µg/mg protein)
          phenotype(a)

JD611     Cps3D-S+         +          9.8 ± 0.6
                           -          0.9 ± 0.2
JD619     Cps3D-S+         +          5.7 ± 0.3
                           -          0.2 ± 0.1
JD614     Cps3D*S*         NA(b)      5.4 ± 0.4 (t0)(c)
                           +          5.9 ± 0.5 (0.5)(c)
JD692     Cps3D*S*         NA         4.8 ± 0.3 (t0)
                           +          7.0 ± 1.0 (2.2)
JD902     Cps3D+S-         +          1.7 ± 0.3
JD908                      +          1.5 ± 0.1
JD897     Cps3D+S-         +          1.1 ± 0.1
WU2       Cps3D+S+         NA         3.8 ± 0.2 (t0)
                           +          16.6 ± 0.3 (12.8)
                                      16.3 ± 0.8
D39       Cps2+            +          0.5 ± 0.3

(a) Capsule phenotypes are based on the cps3D and cps3S
genotypes. - indicates either a stop or insertion
mutation (see FIG. 6E, FIG. 6F and FIG. 9 for
locations of mutations). * indicates either a
missense or in-frame deletion or insertion in cps3D
that apparently also affects cps3S.
(b) NA, not applicable.
(c) For strains which produce capsule in vivo, the
amount of polysaccharide present at the start of the
assay (t0) is given, and the amount of
polysaccharide produced during the assay is
indicated in parentheses.
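
The parenthetical values in Table 7 follow from subtracting the polysaccharide present at the start of the assay (t0) from the amount measured after incubation with UDPGA. A short sketch of that arithmetic is given here, using the means reported in Table 7; the standard deviations are ignored, and the variable and function names are illustrative only.

```python
# Means from Table 7 (µg CPS per mg protein):
# (amount at t0, amount after the assay with UDPGA)
table7 = {
    "JD614": (5.4, 5.9),
    "JD692": (4.8, 7.0),
    "WU2": (3.8, 16.6),
}


def net_synthesis(t0, final):
    """Polysaccharide produced during the assay, to one decimal place."""
    return round(final - t0, 1)


net = {strain: net_synthesis(t0, final)
       for strain, (t0, final) in table7.items()}
print(net)
```

The computed differences agree with the values shown in parentheses in the table for JD614, JD692 and WU2.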
EXAMPLE 15

Biochemical Pathway

Based on the genetic analysis, the homology of the
amino acid sequences of the type-specific genes to the
sequences of enzymes of known function, the behavior of
the mutants in biochemical and immunochemical assays, and
previous biochemical characterizations of type 3 strains
(Austrian, et al., 1959; Dillard and Yother, 1994; Smith,
et al., 1960; Smith, et al., 1961; Bernheimer, 1953), a
pathway for the biosynthesis of type 3 capsular
polysaccharide is proposed (FIG. 10). The last of the
type-specific genes, cps3M, is homologous with
phosphoglucomutases from several bacterial species. Even
though maintained in the type 3-specific region, Cps3U
and Cps3M may not be required for capsule synthesis,
since an insertion internal to cps3U (which has a polar
effect on cps3M) does not result in loss of capsule
production (FIG. 7 and FIG. 9), as judged by colony
morphology on blood agar medium.

EXAMPLE 16

The Downstream Non-Type-specific
Flanking Region and Mapping Other Capsule Types

Southern blotting of digested chromosomal DNA from
strains of types 2, 3 and 6B, probed with pJD377, was
performed as described in Example 4. Faint bands in
addition to the band of interest were observed on the
Southern blots. This was probably due to the detection of
fragments containing the amiA-like genes, which have
homology to plpA. DNA was digested with BglII, SacI or
HindIII. Other laboratory techniques were as described in
Example 1 or Example 8.


Sequence analysis showed that the 1.2 kb SacI-HindIII
fragment (from plasmid pJD377) employed in Example 5
contained the 3' end of cps3M and the 5' half of a gene
with 50% identity to the S. pneumoniae amiA (SEQ ID
NO:6). The amiA-like sequence has recently also been
identified by Pearce et al. and named exp1 (Pearce, et
al., 1993), and subsequently renamed plpA (Pearce, et
al., 1994). Further Southern hybridizations performed as
described in Example 4 showed that the non-type-specific
homologous DNA in the 1.2 kb SacI-HindIII fragment is
plpA.


A partial copy of a transposase gene was also
identified immediately adjacent to and between cpsM and
plpA. Previous findings of repetitive elements linked to
the capsule locus suggest that the deletions in this
region may be the result of a transposition event,
possibly one which introduced the type 3-specific
cassette.

If, as in type 3, the homologous region is directly
adjacent to the type-specific genes in other serotypes,
it should be possible to map other type-specific genes
using this fragment. This was found to be the case, and
the chromosome maps of the capsule regions in strains of
types 2, 3, and 6B, from Southern blots, are shown in
FIG. 11. It can be seen in FIG. 11 that restriction
sites located to the right of the plpA fragment are
highly conserved in all three strains. The type 3 strain
differs slightly in this region due to a deletion of the
5' end of plpA. The sites located to the left of plpA
are divergent among the capsule types. The close linkage
of the region to all the necessary type-specific genes
for each type, combined with the different restriction
maps and the fact that the type 3-specific genes are
located directly adjacent to this fragment, suggests that
this region contains the type-specific genes in all three
capsule types.

EXAMPLE 17

The Upstream Non-Type-specific Flanking Region

In order to isolate DNA 5' of the biosynthetic
genes, a 1.8 kb fragment extending from the upstream SacI
site to just before the PvuII site in cps3D (nucleotide
1 through 1802 of FIG. 6D and FIG. 6E, nucleotide 1
through 934 of SEQ ID NO: 4 and nucleotide 1 through 868
of SEQ ID NO: 5) was amplified from the type 3 WU2
chromosome using inverse PCR as described in Example 8.
All other materials and methods were as described in
Examples 1, 4 and 8.

The 1.8 kb fragment was then used to probe
HindIII-digested chromosomal DNA from seven S. pneumoniae
serotypes (2, 3, 5, 6B, 8, 9 and 22). The fragment
hybridized strongly with the expected fragments at 2.2
and 2.3 kb in the type 3 strain. However, hybridization
was also observed with fragments of 2.6 and 8 kb, along
with weak hybridization with several other fragments
(3.0, 3.1, and 4.4 kb). Likewise, each of the strains
representing other capsule types contained two strongly
homologous fragments (4.8 and 8.0 kb for types 2, 6B, 8,
9; 2.2 and 4.8 or 12 for types 5 and 22, respectively)
and at least one weakly homologous fragment (4.4 kb).
When chromosomal DNAs of types 2, 3, and 6B were digested
with PstI, PvuII, or SacI/HindIII, and probed with the
604 bp SacI-HindIII fragment (pJD392) upstream of cps3D
(within nucleotides 1 through 610; SEQ ID NO:4, FIG. 6D),
4 to 10 bands were detected in each.

Transformation studies were performed to examine
linkage of the repeat upstream region to the
type-specific capsule genes. The plasmid (pJD392) containing
the 604 bp SacI-HindIII fragment was introduced into the
chromosome of the type 3 strain. The insert, located in
the 2.2 kb HindIII fragment (FIG. 7) adjacent to the type

TABLE 8. BANDS DETECTED BY SOUTHERN BLOTTING

          Restriction enzyme used to digest chromosomal DNA
Strain    BglII    SacI                            SphI

type 2    >12*     4.0, >12                        12, >12
type 3    >12*     1.4 (weak at 3.5, 8.5, >12)     10 (weak at 12,5,13)
type 6B   >12*     4.0, 12                         12, >12

* The BglII fragments were not identical in size.
Numbers represent size of bands in kb as observed on
Southern blots carried out as described in Example 4.

DNA in the 1.4 kb SacI fragment was sequenced using
techniques as described in previous Examples, and can be
seen, alongside the predicted amino acid sequences, in
FIG. 6A, FIG. 6B and FIG. 6C (SEQ ID NO:1, SEQ ID NO:2,
SEQ ID NO:3, SEQ ID NO:7, SEQ ID NO:6, SEQ ID NO:9 and
SEQ ID NO:10).

EXAMPLE 18

Capsule Type Expression and Virulence in S. pneumoniae

In these studies, isogenic strains expressing the
type 3 capsule were constructed and the effect on
virulence was determined. Strains of types 2, 5, and 6B
were used as recipients. The type 2 and 5 strains differ
in virulence from the type 3 strain in terms of time
required to cause death (shorter with type 2) and LD50
(lower with type 5). The type 6B strain is of low
virulence in mice. The results showed that expression of
the type 3 capsule attenuated the virulence of the type 5
strain, caused the type 6B strain to become highly
virulent, and had no effect on the type 2 strain. Thus,
in general, the expression of virulence was correlated
with the type of capsule expressed.

A. Methods

1. Transformations, serotyping, ELISAs and restriction
enzyme fragment patterns.

Transformations, ELISAs and DNA manipulations were
performed as described previously in Example 1. All
transformants and parental strains were serotyped with
capsule type-specific antisera (Statens Seruminstitut,
Copenhagen, Denmark) in slide agglutination assays.
Genomic DNA was digested with HindIII for 4 h at 37°C
and electrophoresed overnight through 0.7% agarose in
Tris-borate-EDTA buffer.

2. Analysis of PspA.

Bacteria were grown in CDM containing 2% choline, a
condition that causes release of PspA into the culture
medium. Filtered, unconcentrated supernatant fluids (20
µl) were electrophoresed in sodium dodecyl sulfate
(SDS)-12% polyacrylamide gels. Western blotting
(immunoblotting) was performed by using a semidry electroblotter
(Bio-Rad Laboratories, Richmond, Calif.), and the blots
were processed as described previously (Yother et al.,
1992). The PspA-specific monoclonal antibodies XiR278,
Xi126, and 2A4 were kindly provided by Larry McDaniel
(University of Alabama at Birmingham). Silver staining
was performed by using the Silver Stain Kit from
Stratagene Cloning Systems, Inc. (La Jolla, Calif.).

3. Characterization of morphology and capsule production.

For average chain length determinations, bacteria
were grown in THY to an optical density at 600 nm (OD600)
of ~0.3. Chain lengths were determined microscopically
by using a Petroff-Hausser counting chamber (Arthur H.
Thomas Co., Philadelphia, PA). An average of five
squares was counted for each strain. Comparisons of
average chain lengths were determined by using the
two-sample rank test (Zar, 1984).

The number of cells per colony was determined by
using bacteria grown on blood agar medium for 18 h at
37°C in 5% CO2. A plug containing a single colony was
obtained with a sterile Pasteur pipette and then
resuspended in 50 µl of THY. Tenfold serial dilutions
were performed in THY and plated on blood agar medium.
Plates were incubated overnight at 37°C in 5% CO2, and
the number of CFU per colony was calculated.
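
The back-calculation from a plate count to CFU per colony can be sketched as follows. The 50 µl resuspension volume and the tenfold dilution series come from the text; the plated volume is not stated there, so it is a required, hypothetical parameter here, as are the function name and the example numbers.

```python
def cfu_per_colony(colonies_counted, tenfold_dilution_steps,
                   plated_volume_ul, resuspension_volume_ul=50.0):
    """Estimate the CFU contained in one resuspended colony.

    colonies_counted: colonies on the plate that was counted
    tenfold_dilution_steps: number of tenfold dilutions before plating
    plated_volume_ul: volume spread on the plate (assumed; not in the text)
    resuspension_volume_ul: volume the colony plug was resuspended in
    """
    dilution_factor = 10 ** tenfold_dilution_steps
    cfu_per_ul = colonies_counted * dilution_factor / plated_volume_ul
    return cfu_per_ul * resuspension_volume_ul


# Hypothetical count: 42 colonies from 10 µl of the fifth tenfold dilution.
estimate = cfu_per_colony(42, 5, plated_volume_ul=10.0)
print(f"{estimate:.2e} CFU per colony")
```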

Buoyant density determinations were performed by
using bacteria grown on blood agar medium or in THY.
Bacteria grown on solid medium were harvested by washing
each plate with water, centrifuging the suspension, and
then resuspending the pellet to an OD600 of ~0.4 with
water. Ten-milliliter liquid cultures, grown to an OD600
of ~0.5, were harvested by centrifugation for 10 to 15
min at 8,000 to 16,000 x g. Bacteria were washed twice
with water prior to being loaded onto 10-ml, continuous,
0 to 50% Percoll (Pharmacia, Piscataway, N.J.) gradients.
As standards, 5 µl of density marker beads (Pharmacia)
ranging in density from 1.033 to 1.076 g/ml were also
loaded. Gradients were centrifuged for 30 min at 8,000 x
g with the brake off. A standard curve based on the
migration of the marker beads was generated, and the
density of the bacteria was determined by extrapolation.
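
One way such a bead-based standard curve could be read is sketched below, assuming straight-line interpolation between neighboring marker beads. The two end densities (1.033 and 1.076 g/ml) are from the text; the intermediate bead, all migration distances and the function name are hypothetical. The sketch only handles bands that fall between two beads, and it assumes the bead positions are distinct.

```python
def density_from_migration(distance_mm, bead_positions_mm, bead_densities):
    """Interpolate a buoyant density (g/ml) from the band position,
    using marker beads of known density as the standard curve."""
    points = sorted(zip(bead_positions_mm, bead_densities))
    for (x0, d0), (x1, d1) in zip(points, points[1:]):
        if x0 <= distance_mm <= x1:
            fraction = (distance_mm - x0) / (x1 - x0)
            return d0 + fraction * (d1 - d0)
    raise ValueError("band lies outside the range of the marker beads")


# Hypothetical bead migration distances (mm from the top of the gradient).
positions = [10.0, 30.0, 50.0]
densities = [1.033, 1.055, 1.076]
band_density = density_from_migration(40.0, positions, densities)
print(round(band_density, 4))
```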

For determinations of total capsule content, 1.5-ml
cultures grown in CDM containing 0.0005% choline were
harvested by centrifugation at 8,000 to 16,000 x g for 10
min. Culture supernatant fluids were filtered and saved,
and the cells were resuspended in 500 µl of protoplast
buffer (20% sucrose, 0.005 M Tris [pH 7.4], 0.0025 M
MgSO4). Cell sonicates were produced by three cycles of
a 10-s pulse, followed by a 10-s incubation on ice, with
a Fisher Sonic Dismembrator model 300 (Fisher
Biotechnology, St. Louis, Mo.) with the intensity control
set at 30. Culture supernatant fluids and cell sonicates
were stored at -20°C.

For surface localization assays, 1.5-ml cultures
grown to an OD600 of ~0.5 were heat killed by incubation
at 65°C for 30 min. Bacteria were harvested by
centrifugation, and culture supernatant fluids were
filtered and saved. After the pellets were washed twice
with phosphate-buffered saline (PBS; 137 mM NaCl, 2.7 mM
KCl, 4.3 mM Na2HPO4 · 7H2O, 1.4 mM KH2PO4), the pellets
were resuspended in 1.5 ml of THY. Samples were stored
at 4°C.

4. Virulence assays.

The virulence of the type 3 derivatives was compared
with that of the parental strains in BALB/ByJ female mice
(Jackson Laboratory, Bar Harbor, Maine). Bacteria were
grown to the mid-log phase in THY. Samples were diluted
serially in sterile lactated Ringer's solution, and 0.2
ml was used to infect mice intraperitoneally (i.p.) or
intravenously (i.v.), as indicated. Fifty percent lethal
doses (LD50s) were determined by the method of Reed and
Muench (1938) and compared by Fisher's exact test (Zar,
1984). Median times to death were analyzed by using the
two-sample rank test (Zar, 1984). The P values were
determined by using a two-tailed table.
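
For readers unfamiliar with the Reed and Muench calculation, a minimal sketch is given here. It follows the usual textbook form of the method (cumulative deaths summed from the lowest dose upward, cumulative survivors summed from the highest dose downward, then interpolation on the log dose scale); the function name and the dose-response numbers are hypothetical and are not data from these studies.

```python
import math


def reed_muench_ld50(doses, deaths, survivors):
    """Estimate the LD50 by the Reed-Muench cumulative method.

    doses: inoculum sizes (e.g., CFU), one per group
    deaths, survivors: animals that died or survived in each group
    """
    rows = sorted(zip(doses, deaths, survivors))
    n = len(rows)

    # An animal killed by a low dose is assumed to die at any higher dose.
    cum_dead, running = [], 0
    for _, dead, _ in rows:
        running += dead
        cum_dead.append(running)

    # An animal surviving a high dose is assumed to survive any lower dose.
    cum_surv, running = [0] * n, 0
    for i in range(n - 1, -1, -1):
        running += rows[i][2]
        cum_surv[i] = running

    mortality = [d / (d + s) for d, s in zip(cum_dead, cum_surv)]

    for i, fraction in enumerate(mortality):
        if fraction == 0.5:
            return float(rows[i][0])
    for i in range(n - 1):
        low, high = mortality[i], mortality[i + 1]
        if low < 0.5 < high:
            # Proportionate distance between the bracketing doses.
            pd = (0.5 - low) / (high - low)
            log_low = math.log10(rows[i][0])
            log_high = math.log10(rows[i + 1][0])
            return 10 ** (log_low + pd * (log_high - log_low))
    raise ValueError("the doses tested do not bracket 50% mortality")


# Hypothetical experiment: four tenfold doses, five mice per group.
ld50 = reed_muench_ld50([1e1, 1e2, 1e3, 1e4],
                        deaths=[0, 1, 4, 5],
                        survivors=[5, 4, 1, 0])
print(f"LD50 is approximately {ld50:.0f} CFU")
```

The estimate is only defined when the tested doses bracket 50% cumulative mortality, which is why the sketch raises an error otherwise rather than extrapolating.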

B. Results

As described in Table 3, strain JD770 contains a
non-destructive insertion in the type 3 capsule locus.
The amount and cellular localization of the capsular
material produced by JD770 are identical to those of its
parent strain WU2. Transformation with JD770 chromosomal
DNA, and selection for erythromycin resistance, results
in isolates that express the type 3 capsule of the donor
but not the capsule of the recipient strain (Example 6).
Based on this result, JD770 DNA was used to transform
type 2, 5, and 6B recipients, and erythromycin-resistant
isolates were selected (see Table 9). All of
the type 2 Ery^r transformants expressed the type 3
capsule but not the type 2 capsule; >95% of the type 5
and type 6B Ery^r transformants expressed the type 3
capsule but not the capsule of the recipient parent. The
remainder of the type 5 and 6B transformants expressed
the capsular type of the recipient parent only.

To produce essentially isogenic strains, two
independent transformants from each cross were
backcrossed at least three times to their respective
parent recipient strains. The final isolates were
examined for restriction enzyme fragment patterns,
pneumococcal surface protein A (PspA) expression, capsule
expression, and morphological characteristics prior to
testing in a mouse virulence model.

1. Restriction enzyme fragment patterns.

The HindIII restriction patterns of the strains used
in these studies can be easily distinguished. In all
cases, the type 3 derivatives constructed here were found
to have the HindIII pattern of the recipient strain,
indicating that gross alterations in the genomic DNA
content had not occurred and that the parent donor strain
JD770 had not been inadvertently re-isolated.

2. PspA expression.

PspA varies with respect to molecular weight,
antigenic determinants, and strain distribution. PspA
serotypes and capsular serotypes do not correlate. The
strains used in these studies expressed PspAs that had
different molecular weights and reacted with different
PspA-specific monoclonal antibodies. In all cases, the
PspAs of the type 3 derivatives constructed here were
found to have the molecular weight and antibody
reactivities of the parent recipient strains.

3. Morphologic characterization and capsule production.

Microscopic examination revealed that alteration of
capsular type had no effect on the chain length of the
type 2 and type 6B derivative strains. However, the
chain lengths of the type 5 derivatives differed
significantly from that of the type 5 parental strain and
were almost identical to that of the type 3 parent
(Table 9).

Morphologically, type 3 strains exhibit large mucoid
capsules when grown on blood agar plates, whereas type 2,
5, and 6B strains have small mucoid capsules. The type 3
derivatives of the type 2, 5, and 6B strains had a
similar appearance to the type 3 parent on blood agar
plates. The increase in colony size compared with that
of the recipient parents did not appear to be due to cell
number since similar numbers of cells per colony were
observed for all of the parent and derivative type 3
strains (data not shown). To examine capsule production,
Percoll density gradients and ELISAs were performed.
Percoll density gradient centrifugation has been shown
previously to differentiate capsular serotypes and
amounts by density (Briles et al., 1992). In this assay,
all of the derivatives had densities similar to that of
the parent type 3 strain and distinct from that of the
recipient parent strains (FIG. 12). Thus, all of the
derivatives produced cell-associated, surface-localized
type 3 capsule in amounts similar to that of the type 3
parent. The total amounts of capsule material produced,
i.e., both cell associated and released, were determined
in ELISAs to be similar for both the type 3 parent and
each of the derivatives (FIG. 13). ELISAs were also used
to confirm that the amounts of surface-accessible capsule
were similar in the type 3 parent and the derivatives.

4. Virulence of type 3 derivatives.

To assess the effect of alteration of capsule type
on virulence, BALB/ByJ female mice were infected i.p. or
i.v. with the type 3 derivatives and parent strains.
Strain JD770, which contains the nondestructive
erythromycin resistance marker in the type 3 capsule
locus, did not differ from its parent type 3 strain WU2
in terms of median time to death (52 versus 49.5 h, i.p.)
or LD50 (75 versus 50 CFU, i.p.; 1 x 10^ versus 2 x 10^
CFU, i.v.). Thus JD770 was used in subsequent studies
for comparisons with the type 3 derivatives. As
expected, the recipient parent strains were significantly
different from JD770 with respect to time to death or
LD50s (FIG. 14). Expression of the type 3 capsule had no
apparent effect on the virulence of the type 2 recipient
strain; i.e., the time required to cause death was not
significantly different from that of the type 2 parent
but was significantly different from that of the type 3
parent (FIG. 14A). However, alteration of capsular type
had dramatic effects on the virulence of the type 5 and
6B strains. In contrast to the highly virulent type 5
parental strain (LD50, ~10 CFU) and the virulent type 3
parental strain (LD50, ~10^), the type 3 derivatives of
the type 5 strain were not virulent even at doses of 10^
CFU (FIG. 14B). Switching of the type 6B capsule to type 3
resulted in a reduction of the LD50 from >1 x 10^ to
~6 x 10^ CFU, a value that was similar to but still
greater than the 7.5 x 10^ value observed for the type 3
parent strain (FIG. 14C).

These results may be indicative of the role other
factors play in pneumococcal virulence. For example, the
type 5 capsule may represent one that results in high
virulence with few other factors required, whereas the
type 3 capsule may require the presence of other factors
to be highly virulent. The introduction of the type 3
capsule into the type 5 genetic background may thus
result in the expression of a virulent capsule but, in
the absence of other necessary factors, in an avirulent
strain. The increase in virulence of the type 6B strain
suggests that the type 3 capsule is probably more
virulent than the type 6B. However, its failure to
become as virulent as the type 3 parent is suggestive of
a lack of other virulence factors in the 6B background.
The type 2 recipient was only slightly more virulent than
the type 3 donor and no significant change was noted in
the virulence of its derivatives (FIG. 12). This result
may suggest that the type 2 and type 3 capsules are of
equal virulence and that the "accessory factors"
necessary for full virulence are present in both strains.

Whether the decrease in virulence of type 5
derivatives is related to the alteration of cell chain
length is not known. Clearly, the parent type 3 strain
is highly virulent with a similar chain length. The
alteration in chain length may reflect a general change
in the surface structure of the type 5 strains, possibly
resulting from the change in capsule expression. Because
the strains constructed were transformed with chromosomal
DNA, the inventors cannot rule out the possibility that
determinants closely linked to the capsule locus are
affecting the outcome of these studies. However, because
several backcrosses were performed, and because
independent isolates exhibited identical characteristics,
it is unlikely that unlinked determinants are responsible
for the results.

EXAMPLE 19

Increased Virulence of S. pneumoniae
Type 6B by Inactivation of plpA

In Example 18, the introduction of the type 3-specific
cassette and linked genes into an avirulent type
6B strain resulted in expression of the type 3 capsule
and an increase in virulence. To more clearly define the
contribution of the capsular serotype to the virulence of
S. pneumoniae, insertion-duplication mutagenesis was used
to insert an erythromycin marker adjacent to the type
6B-specific capsule cassette in the 3' flanking region.
Surprisingly, introduction of this insertion resulted in
an increase in virulence of the type 6B strain
(derivative LD50 of 10^ versus parental LD50 of >10^,
intraperitoneal). This enhancement of virulence could be
attributed to the 1.2 kb SacI-HindIII fragment from the
type 3 strain WU2 (pJD377) that was used to direct the
erythromycin marker into the type 6B chromosome.
Transformation of the wild type 6B strain with the
SacI-HindIII fragment alone, followed by intraperitoneal
infection of mice with the transformation mixture,
resulted in death in less than 24 hours. Identical
results were obtained using multiple smaller fragments of
the SacI-HindIII fragment. pJD377 (Example 5) comprises
the 3' end of cps3M and part of the gene plpA. The
fragments used contained mutations that, like the
original insertion, resulted in a defective plpA in the
type 6B strain. These data suggest that avirulence of
type 6B observed via the intraperitoneal route is due to
expression of plpA, and that the increase in virulence of
the type 6B strain expressing the type 3 capsule is the
result of inactivation of the linked plpA and not
expression of the type 3 capsule.

* * * 


All of the compositions and methods disclosed and
claimed herein can be made and executed without undue
experimentation in light of the present disclosure.
While the compositions and methods of this invention have
been described in terms of preferred embodiments, it will
be apparent to those of skill in the art that variations
may be applied to the composition, methods and in the
steps or in the sequence of steps of the method described
herein without departing from the concept, spirit and
scope of the invention. More specifically, it will be
apparent that certain agents which are both chemically
and physiologically related may be substituted for the
agents described herein while the same or similar results
would be achieved. All such similar substitutes and
modifications apparent to those skilled in the art are
deemed to be within the spirit, scope and concept of the
invention as defined by the appended claims.

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

TCATTTGATA TGCCTCCG

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

GTGAGATAAA TAGTAGTGCG

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

TCCAGCTCGT GTCATAATCT
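
The declared LENGTH fields of the three oligonucleotides listed above can be cross-checked against the printed sequences. The following is a minimal sketch in Python; the helper names (`normalize`, `reverse_complement`, `check_lengths`) are illustrative assumptions and are not part of the disclosure.

```python
# Oligonucleotides as printed in SEQ ID NO: 18-20, paired with the declared LENGTH.
OLIGOS = {
    "SEQ ID NO: 18": ("TCATTTGATA TGCCTCCG", 18),
    "SEQ ID NO: 19": ("GTGAGATAAA TAGTAGTGCG", 20),
    "SEQ ID NO: 20": ("TCCAGCTCGT GTCATAATCT", 20),
}

# Watson-Crick complement for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalize(seq: str) -> str:
    """Remove the ten-base spacing used in the listing and upper-case the bases."""
    return "".join(seq.split()).upper()


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return normalize(seq).translate(COMPLEMENT)[::-1]


def check_lengths(oligos: dict) -> list:
    """Return the names of entries whose sequence length differs from the declared one."""
    return [
        name
        for name, (seq, declared) in oligos.items()
        if len(normalize(seq)) != declared
    ]


if __name__ == "__main__":
    for name, (seq, declared) in OLIGOS.items():
        print(name, "declared", declared, "counted", len(normalize(seq)))
    print("entries with a length mismatch:", check_lengths(OLIGOS))
```

An empty mismatch list indicates that each printed sequence agrees with its declared length.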



CLAIMS

1. A nucleic acid segment less than about 10 kb in
length that comprises a non-type specific S. pneumoniae
cps gene flanking region of sufficient length to allow
hybridization under standard hybridization conditions to
a S. pneumoniae cps gene flanking region.

2. The nucleic acid segment of claim 1, wherein the
segment comprises a non-type specific S. pneumoniae cps
gene 5' flanking region.

3. The nucleic acid segment of claim 2, wherein the
segment includes a non-type specific S. pneumoniae cps
gene 5' flanking region encoding for a peptide comprising
SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9 or SEQ ID NO:10.

4. The nucleic acid segment of claim 3, wherein the
segment includes a non-type specific S. pneumoniae cps
gene 5' flanking region encoding for a peptide comprising
SEQ ID NO:7.

5. The nucleic acid segment of claim 3, wherein the
segment includes a non-type specific S. pneumoniae cps
gene 5' flanking region encoding for a peptide comprising
SEQ ID NO:8.

6. The nucleic acid segment of claim 3, wherein the
segment includes a non-type specific S. pneumoniae cps
gene 5' flanking region encoding for a peptide comprising
SEQ ID NO:9.

7. The nucleic acid segment of claim 3, wherein the
segment includes a non-type specific S. pneumoniae cps
gene 5' flanking region encoding for a peptide comprising
SEQ ID NO:10.



8. The nucleic acid segment of claim 2, wherein the
segment comprises a non-type specific S. pneumoniae cps
gene 5' flanking region having a sequence that
corresponds to at least a 60 nucleotide long contiguous
stretch of SEQ ID NO:4.

9. The nucleic acid segment of claim 8, wherein the
segment comprises a non-type specific S. pneumoniae cps
gene 5' flanking region having a sequence that
corresponds to at least a 100 nucleotide long contiguous
stretch of SEQ ID NO:4.

10. The nucleic acid segment of claim 9, wherein the
segment comprises a non-type specific S. pneumoniae cps
gene 5' flanking region having a sequence that
corresponds to at least a 500 nucleotide long contiguous
stretch of SEQ ID NO:4.

11. The nucleic acid segment of claim 10, wherein the
segment comprises a non-type specific S. pneumoniae cps
gene 5' flanking region having a sequence that
corresponds to SEQ ID NO:4.



12. The nucleic acid segment of claim 2, wherein the
segment comprises a non-type specific S. pneumoniae cps
gene 3' flanking region.

13. The nucleic acid segment of claim 12, wherein the
segment comprises a non-type specific S. pneumoniae cps
gene 3' flanking region having a sequence that
corresponds to at least a 15 nucleotide long contiguous
stretch of SEQ ID NO:6.

14. The nucleic acid segment of claim 13, wherein the
segment comprises a non-type specific S. pneumoniae cps
gene 3' flanking region having a sequence that
corresponds to at least a 30 nucleotide long contiguous
stretch of SEQ ID NO:6.

15. The nucleic acid segment of claim 14, wherein the
segment comprises a non-type specific S. pneumoniae cps
gene 3' flanking region having a sequence that
corresponds to at least a 60 nucleotide long contiguous
stretch of SEQ ID NO:4.

16. The nucleic acid segment of claim 15, wherein the
segment comprises a non-type specific S. pneumoniae cps
gene 3' flanking region having a sequence that
corresponds to at least a 100 nucleotide long contiguous
stretch of SEQ ID NO:6.

17. The nucleic acid segment of claim 16, wherein the
segment comprises a non-type specific S. pneumoniae cps
gene 3' flanking region having a sequence that
corresponds to at least a 500 nucleotide long contiguous
stretch of SEQ ID NO:6.

18. The nucleic acid segment of claim 17, wherein the
segment comprises a non-type specific S. pneumoniae cps
gene 3' flanking region having a sequence that
corresponds to SEQ ID NO:6.

19. The nucleic acid segment of claim 1, wherein the
segment comprises a non-type specific S. pneumoniae cps
gene 5' flanking region and a non-type specific
S. pneumoniae cps gene 3' flanking region.

20. The nucleic acid segment of claim 19, wherein the
segment comprises a 5' flanking region that encodes for a
peptide comprising SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9
or SEQ ID NO:10 and a 3' flanking region sequence that
corresponds to at least a 30 nucleotide long contiguous
stretch of SEQ ID NO:6.



21. The nucleic acid segment of claim 20, wherein the
segment comprises a 5' flanking region that encodes for a
peptide comprising SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9
or SEQ ID NO:10 and a 3' flanking region sequence that
corresponds to at least a 100 nucleotide long contiguous
stretch of SEQ ID NO:6.

22. The nucleic acid segment of claim 21, wherein the
segment comprises a 5' flanking region that encodes for a
peptide comprising SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9
or SEQ ID NO:10 and a 3' flanking region sequence that
corresponds to SEQ ID NO:6.



23. The nucleic acid segment of claim 19, wherein the
segment comprises a 5' flanking region sequence that
corresponds to at least a 60 nucleotide long contiguous
stretch of SEQ ID NO:4 and a 3' flanking region sequence
that corresponds to at least a 30 nucleotide long
contiguous stretch of SEQ ID NO:6.

24. The nucleic acid segment of claim 23, wherein the
segment comprises a 5' flanking region sequence that
corresponds to at least a 100 nucleotide long contiguous
stretch of SEQ ID NO:4 and a 3' flanking region sequence
that corresponds to at least a 100 nucleotide long
contiguous stretch of SEQ ID NO:6.

25. The nucleic acid segment of claim 24, wherein the
segment comprises a 5' flanking region sequence that
corresponds to SEQ ID NO:4 and a 3' flanking region
sequence that corresponds to SEQ ID NO:6.

26. The nucleic acid segment of claim 1, further defined
as including a type specific S. pneumoniae cps gene
region of sufficient length to allow hybridization to a
S. pneumoniae cps gene region under standard
hybridization conditions.

27. The nucleic acid segment of claim 26, further
defined as less than about 5,000 nucleotides in length.

28. The nucleic acid segment of claim 21, further
defined as less than about 1,000 nucleotides in length.

29. The nucleic acid segment of claim 1, further defined
as a recombinant vector.

30. A nucleic acid cassette less than about 20 kb in
length that comprises a non-type specific S. pneumoniae
cps gene 5' flanking region sequence and a non-type
specific S. pneumoniae cps gene 3' flanking region
sequence, the flanking region sequences being of
sufficient length to allow hybridization under standard
hybridization conditions to a S. pneumoniae cps gene
flanking region.

31. A nucleic acid segment of up to about 20 kb in
length, comprising a S. pneumoniae cps gene region of
sufficient length to allow hybridization to a
S. pneumoniae cps gene region under standard
hybridization conditions.

32. The nucleic acid segment of claim 31, further
defined as comprising a cpsB gene.

33. The nucleic acid segment of claim 31, further
defined as comprising a cpsC gene.

34. The nucleic acid segment of claim 31, further
defined as comprising a cpsE gene.

35. The nucleic acid segment of claim 31, further
defined as comprising a cpsD gene.

36. The nucleic acid segment of claim 31, further
defined as comprising a cpsS gene.

37. The nucleic acid segment of claim 31, further
defined as comprising a cpsU gene.

38. The nucleic acid segment of claim 31, further
defined as comprising a cpsM gene.

39. The nucleic acid segment of claim 31, further
defined as comprising a 'plpA gene.

40. The nucleic acid segment of claim 31, further
defined as comprising a tnpA gene.

41. The nucleic acid segment of claim 31, further
defined as comprising a complete S. pneumoniae cps gene
region.

42. The nucleic acid segment of claim 31, wherein the
S. pneumoniae cps gene region is defined as a Type 3 cps
gene region.

43. The nucleic acid segment of claim 42, further
defined as comprising a Type 3 cpsB, cpsC, cpsE, cpsD,
cpsS, cpsU, cpsM, tnpA and 'plpA gene.

44. The nucleic acid segment of claim 31, further
defined as comprising a cps gene flanking region, wherein
the flanking region corresponds to any nucleic acid
segment in accordance with the foregoing claims 1 through
25.

45. The nucleic acid segment of claim 44, further
defined as less than about 10,000 nucleotides in length.

46. The nucleic acid segment of claim 45, further
defined as less than about 5,000 nucleotides in length.

47. The nucleic acid segment of claim 44, further
defined as a DNA cassette bounded at each terminus by a
PCR primer of known sequence or a restriction enzyme
recognition site.



48. The nucleic acid segment of claim 47, wherein the
segment is bounded by an SphI or SalI site.

49. The nucleic acid segment of claim 44, further
defined as a recombinant vector.

50. The nucleic acid segment of claim 49, further
defined as a recombinant vector comprising at least one
S. pneumoniae cps gene and sufficient flanking region to
allow homologous recombination of the fragment in a
S. pneumoniae host cell.

51. The nucleic acid segment of claim 50, further
defined as comprising a complete S. pneumoniae cps gene
region.

52. A recombinant host cell comprising a recombinant
vector comprising a nucleic acid segment in accordance
with claim 44.

53. The recombinant host cell of claim 52, further
defined as a recombinant E. coli host cell.

54. The recombinant host cell of claim 52, further
defined as a recombinant gram positive host cell.

55. The recombinant host cell of claim 54, further
defined as a Bacillus, Staphylococcus, or Streptococcus
host cell.

56. The recombinant host cell of claim 55, further
defined as a recombinant S. pneumoniae host cell.

57. A recombinant host cell in accordance with claim 52,
further defined as including an engineered resistance
gene.

58. A recombinant S. pneumoniae cell of a selected
serotype, the cell expressing a cps gene of another
S. pneumoniae serotype.

59. The recombinant S. pneumoniae cell of claim 58,
expressing a cpsB, cpsC, cpsE, cpsD, cpsS, cpsU, cpsM,
plpA or tnpA gene.

60. A method for preparing a recombinant host cell,
comprising preparing a S. pneumoniae cps gene and
transforming a host cell with said gene.



61. The method of claim 60, wherein the host cell is
defined as a S. pneumoniae host cell, and the cps gene is
introduced by a method comprising the steps of:

(a) preparing a DNA segment that includes a
selected S. pneumoniae cps gene flanked by
sufficient S. pneumoniae flanking regions
to allow homologous recombination in the
S. pneumoniae host;

(b) transforming the S. pneumoniae host with
the DNA segment; and

(c) selecting a recombinant host that
expresses the S. pneumoniae cps gene.

62. The method of claim 61, wherein the DNA segment is a
plasmid.

63. The method of claim 61, wherein the host, prior to
transformation, is a high producer of the capsular
polysaccharides.

64. The method of claim 63, wherein corresponding cps
gene of the host has been replaced by homologous
recombination with the recombinant cps gene.

65. The method of claim 61, wherein the cell is selected
by means of a resistance gene.

66. The method of claim 65, wherein the resistance gene
is positioned in the non-type specific cps region.

67. The method of claim 66, wherein the resistance gene
is an erythromycin resistance gene.



68. A method for detecting S. pneumoniae in a sample,
comprising the steps of:

(a) obtaining nucleic acids from a sample
suspected of containing S. pneumoniae;

(b) subjecting said nucleic acids to
hybridization with a S. pneumoniae cps
nucleic acid segment comprising a cps gene
flanking region or a cps gene coding
region of sufficient length to allow
hybridization to S. pneumoniae cps nucleic
acids under standard hybridization
conditions; and

(c) detecting the hybridized nucleic acids.

69. The method of claim 68, wherein said S. pneumoniae
cps nucleic acid segment comprises a non-type specific
S. pneumoniae cps gene flanking region of sufficient
length to allow hybridization under standard
hybridization conditions to a S. pneumoniae cps gene
flanking region.

70. The method of claim 68, wherein the nucleic acids
from said sample are subjected to restriction enzyme
digestion and size separation prior to hybridization with
said S. pneumoniae cps nucleic acid segment.

71. The method of claim 70, wherein the nucleic acids
are subjected to SphI digestion.

72. The method of claim 68, wherein said detection of
hybridized nucleic acid involves PCR.

73. A method for determining the capsule type of an
unknown S. pneumoniae strain, comprising obtaining
nucleic acids from the strain and hybridizing said
nucleic acids with a S. pneumoniae cps DNA segment
comprising either:

(a) a non-type specific S. pneumoniae cps gene
flanking region of sufficient length to
allow hybridization under standard
hybridization conditions to a
S. pneumoniae cps gene flanking region; or

(b) a type specific S. pneumoniae cps gene
region of sufficient length to allow
hybridization to a S. pneumoniae cps gene
under standard hybridization conditions.

74. A method of generating an antibody response,
comprising administering to an animal an immunologically
effective amount of a Cps peptide or protein.

75. The method of claim 74, wherein the Cps peptide or
protein is encoded for by any one of the nucleic acid
sequences in the foregoing claims 32 through 40.

76. A method for detecting S. pneumoniae in a sample,
comprising the steps of:

(a) obtaining proteins from a sample suspected
of containing S. pneumoniae;

(b) binding said proteins with an antibody;

(c) detecting the bound proteins.

77. The method of claim 75, wherein said antibody
corresponds to an antibody directed against a Cps protein
or peptide.

78. The method of claim 77, wherein said antibody is
labeled.

79. The method of claim 75, wherein said proteins are
separated by electrophoresis.

80. A method for preventing infection of a subject with
S. pneumoniae by administering a composition comprising
an antibody directed against a Cps protein or peptide.
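
Two recurring tests in the claims above are whether a segment shares a contiguous stretch of a stated minimum length with a reference sequence (for example, 60, 100 or 500 nucleotides of SEQ ID NO:4 or SEQ ID NO:6) and whether a cassette is bounded by SphI or SalI sites. The sketch below illustrates both in Python under loud assumptions: exact base identity is used as a simplified stand-in for "corresponds to", it does not model hybridization conditions, the function names are hypothetical, and the short sequences in the demonstration are made-up toy strings, not sequences from this disclosure. The recognition sequences used are the standard ones for SphI (GCATGC) and SalI (GTCGAC).

```python
# Standard recognition sequences of the two enzymes named in the claims.
SITES = {"SphI": "GCATGC", "SalI": "GTCGAC"}


def longest_shared_stretch(candidate: str, reference: str) -> int:
    """Length of the longest contiguous run of bases present in both sequences.

    Classic longest-common-substring dynamic programme, kept to two rows.
    """
    a, b = candidate.upper(), reference.upper()
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def meets_stretch(candidate: str, reference: str, min_len: int) -> bool:
    """True if the candidate shares at least min_len contiguous bases with the reference."""
    return longest_shared_stretch(candidate, reference) >= min_len


def find_sites(seq: str, site: str) -> list:
    """Return 0-based start positions of every (possibly overlapping) occurrence of site."""
    s, hits = seq.upper(), []
    start = s.find(site)
    while start != -1:
        hits.append(start)
        start = s.find(site, start + 1)
    return hits


if __name__ == "__main__":
    # Toy strings for illustration only; they are not taken from the sequence listing.
    toy_reference = "TTCCCGGTT"
    toy_candidate = "AAACCCGGG"
    print(longest_shared_stretch(toy_candidate, toy_reference))
    toy_cassette = "AAGCATGCTTGTCGACGCATGC"
    for enzyme, site in SITES.items():
        print(enzyme, find_sites(toy_cassette, site))
```

In practice the thresholds would be the lengths recited in the claims and the reference would be the full flanking-region sequence; mismatches tolerated under standard hybridization conditions would require an alignment-based comparison rather than exact matching.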
Cps3B

GAGCT CGG TAT TTX TTG GAA CGT GAT TTA GTT CAT GTA GTT GCA AGT GAC ATG CAC AaT TTa
ala arg tyr phe leu glu arg asp leu val his val val ala ser asp met his asn leu

GAC AGT AGA CCT CCA TAT ATg CAA CAG GCA TAT gAT ATc ATT GCT AAG AAA TAT AGA GQQ
asp ser arg pro pro tyr met gln gln ala tyr asp ile ile ala lys lys tyr arg ala

stop Cps3B

AAA AAA GCO AAA GAA CTT TTT GTA GAT AAT CCC AGA AAA ATT ATA ATG GAT CAT TAA TTA
lys lys ala lys glu leu phe val asp asn pro arg lys ile ile met asp his ***

start Cps3C

GGA GAA AAT ATG AAG GAA CAA AAC ACT TTG GAA aTC GAT GTA TTG CAG TAT Tca gaG CTT
gly glu asn met lys glu gln asn thr leu glu ile asp val leu gln tyr ser glu leu

aTT GGa aGa aGT GtC aTT TTa TTa GTG GCa TTa TaC TTC TTC aGT TGC TTT TTc CTa C
ile gly arg ser val ile leu leu val ala leu tyr phe phe ser cys phe phe leu

~200 nt gap in cps3C

FIG. 6A

cps3C (continued to PstI site. ~45 nt expected beyond PstI to Cps3C stop)

ctt ccg tat gcc gcG TTC CTg caa gac ccc aTC GGC Tgg CTC TTT Gat aga GTa GCT gcc
leu pro tyr ala ala phe leu gln asp pro ile gly trp leu phe asp arg val ala ala

caa aaa aTT aTC AGT ATT ACT CGT GCT GAT GTg gca cAC TGG AGG Agc aag acc gcc GAT
gln lys ile ile ser ile thr arg ala asp val ala his trp arg ser lys thr ala asp

ATC ACC gCT TcG CCA AAT aaT AAA cGC AAT ACA CTA ATT GGT TTT TTG Gca TTT TTT ATT
ile thr ala ser pro asn asn lys arg asn thr leu ile gly phe leu ala phe phe ile

GGA ACT AGT GTc ATA GTT CTT CTT CTT GAA CTT TTG GAC ACT CAT GTG AAA cGT CCG GAA
gly thr ser val ile val leu leu leu glu leu leu asp thr his val lys arg pro glu

PstI

GAT ATC GAA GAT ACA CTG CAG
asp ile glu asp thr leu gln

FIG. 6B

~350 nt gap

CAG CAc ttc tCA TCT GGC ACA GCT GAt ttA TCT CAC GGc ttA TGT GAt ACA AAT ATT gAA
gln his phe ser ser gly thr ala asp leu ser his gly leu cys asp thr asn ile glu

AAT TTA TTT GTA GTT CAA TCG GGA TCT GTA TCA CCA AAC CCT ACA GCC TTG Tca CAA AGc
asn leu phe val val gln ser gly ser val ser pro asn pro thr ala leu ser gln ser

End homology with Cps19fD

Repeat

AAA AAT TTT GTG GTT ATG GTA AAG cTT TTT TCA AAA GAG GTC AGT ATA TTG AGT TGG TGG
lys asn phe

AAA CGA TAA AAT ACA TAA TAT TTT ATT CCT TTG CTT ATC AAA TTA GCC CCT CCT GAA GCT
CCC CAA TTG ACG GCT TGA GCT C

FIG. 6C
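
The three-letter translations printed beneath the codons in FIGS. 6A-6C can be checked against the standard genetic code. The sketch below, in Python, translates one row of FIG. 6A (the row beginning GGA GAA AAT ATG at the start of Cps3C). It assumes the standard codon table applies and that the two scan-damaged codons in that row read AAG and TCA; the function and variable names are illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated with base order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter codes, lower case as in the figures; *** marks a stop.
THREE_LETTER = {
    "A": "ala", "R": "arg", "N": "asn", "D": "asp", "C": "cys",
    "Q": "gln", "E": "glu", "G": "gly", "H": "his", "I": "ile",
    "L": "leu", "K": "lys", "M": "met", "F": "phe", "P": "pro",
    "S": "ser", "T": "thr", "W": "trp", "Y": "tyr", "V": "val",
    "*": "***",
}

# Row from FIG. 6A; AAG and Tca are assumed readings of damaged characters.
FIG_6A_ROW = (
    "GGA GAA AAT ATG AAG GAA CAA AAC ACT TTG "
    "GAA aTC GAT GTA TTG CAG TAT Tca gaG CTT"
)


def translate(row: str) -> str:
    """Translate space-separated codons into three-letter amino acid codes.

    Incomplete codons (fewer than three bases) are skipped, since some figure
    rows end in a partial codon.
    """
    codons = [c for c in row.upper().split() if len(c) == 3]
    return " ".join(THREE_LETTER[CODON_TABLE[c]] for c in codons)


if __name__ == "__main__":
    print(translate(FIG_6A_ROW))
```

Codons containing characters other than A, C, G or T (for example a position that is illegible in the scan) raise a `KeyError`, which flags rows that still need to be checked against the original figure.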
FIGS. 6D-6I

[Continuous nucleotide sequence with single-letter amino acid translations beneath the coding regions and nucleotide position numbers at the right margin. The sequence text is illegible in the scan.]